Foreword

The seven papers in this volume were originally presented at 
the second of two summer conferences on “Mathematical Applica- 
tions in Political Science” held at Southern Methodist University, 
July 19-29, 1964* and July 18-August 7, 1965. To further the inter-
disciplinary design of the second conference, four papers were
solicited from political scientists and three from scholars in other 
disciplines. Contributors from political science are Hayward R. 
Alker, Jr. and Richard L. Merritt of Yale University and Gerald 
Kramer and William H. Riker of the University of Rochester. Other 
contributors are Otto A. Davis (economist) and Melvin Hinich 
(statistician), Carnegie Institute of Technology, Carl F. Kossack 
(computer statistician) formerly of the Graduate Research Center 
of the Southwest and now of the University of Georgia, and Frank 
S. Scalora (mathematician), IBM-World Trade Corporation.

These conferences, sponsored by the NATIONAL SCIENCE 
FOUNDATION, were conceived to assist political scientists in 
learning how mathematical applications may be effectively utilized 
in their discipline. The meetings were designed to afford oppor- 
tunities for the presentation of techniques and models involving 
statistical and mathematical applications and for high level dis- 
cussions devoted to determination of the limits and validity of 
these relatively advanced concepts as utilized in political science. 


* The two conferences were conceived and directed by Joseph Laurence Bernd. For 
the four published papers of the 1964 conference, see John M. Claunch (ed.), Mathe-
matical Applications in Political Science (Dallas: Arnold Foundation Monographs,
Southern Methodist University, 1965). Contributors are: Harold D. Guetzkow, William
H. Riker, Donald E. Stokes, and S. Sidney Ulmer. 
1. Solutions to Methodological
Problems 

Introductory Note 

It would be surprising if the use of mathematics in any new field
were spectacularly successful and encompassing from the outset.

—Oskar Morgenstern

In its inception, a few devotees of the new science of politics 
appear to have assumed that the magic of numbers, like Athene, 
springing full grown from the head of Zeus, would solve all 
problems of measurement, causation and correlative relationship. 
(These persons might have observed that in economics and psy-
chology subjective hypothesizing is still in order, although the
subjective element has been reduced and precision enhanced 
through the use of mathematics.) The leap of faith, substituting 
mathematics and mechanistic models for old dogmas as the
objects of faith, implies a misunderstanding of the significance of 
their enterprise. Mathematical models, properly employed, offer 
the advantages of precision in definition, identification, and com- 
munication. They are not the be-all and the end-all of scientific 
inquiry. Subjective human agency is still relevant and essential in
conceiving and formulating, identifying and analyzing, but in cer-
tain important aspects of the process this agency and its attendant
biases may be reduced or removed. 

Given the exorbitant expectations of a few bemused devotees, 
groping for the essentials of the new science, it is not surprising 
that other scholars, initially dubious, accepted the contrast between 
optimistic expectations and subsequent paltry achievements as 
conclusive evidence that the entire operation was a hopeless and 
permanent failure. The application of mathematics to politics (the
horseless carriage of the social sciences) is merely a fad (and will
never replace Old Dobbin), they declared. In fairness it must be
conceded that some of the studies (certainly not all, or most) seem 
to justify harsh conclusions: superficial ecological correlations, 
addition of the nonadditive, models requiring unobtainable data,
or other equally slipshod procedures were exhibited. 

It is easy to write off and consign to oblivion a new system, if 
one bases his conclusion on the obsolescent data of the earliest, 
and often halting and confused, period of development. This
fallacy of premature rejection is readily detectable when hind- 
sight is applied to most human endeavors. Imagine what a re- 
viewer might say, for instance, in freshly applying today’s sophisti- 
cated philosophical and methodological standards to the works of 
Kepler or Bodin. Yet these early scholars, with all of their limits 
and defects, were obviously important precursors of future con-
tributions to the understanding of the cosmos and of man.

The French military command in the thirties was correct, on 
the basis of the available data of 1917 and 1918, in dismissing the 
strategic value of aircraft and in doubting the ability of the tank 
to pierce strong, static fortifications. Their monumental miscalcu- 
lations were rooted in a reliance on obsolescent data, in a failure 
to anticipate technical improvements in the design of aircraft and 
tanks, and in their neglecting to keep pace with the effective inte- 
gration of these weapons into offensive military systems. Is it pos- 
sible that some critics of the new political science are prone to rely 
on obsolescent data and hyperbolic claims, drawn from the earliest 
stage and the least worthy disciples of the school? 

Andrew Hacker, in calling for the abandonment of “the hope that
political analysis can be either objective or scientific,” may be 
correct, if “objectivity” and “science” are defined in the narrowest 
sense. Hacker’s fallacy, in calling for a return to “subjectivity” 
(Does he mean to glorify the concept and rest contented with the 
shopworn status quo ante?) may be due to a failure to perceive 
how scientific processes evolve. Let us suggest an alternative to 
Hacker’s policy, more realistic in the light of scholarly history and 
current developments:¹

Let each student of politics follow the bent of his own tastes. 
Some will wish to remain subjective, including those who despair 
of being otherwise, or even those who prefer to be as subjective 
as possible. Others with the training and inclination will wish to 
join the quest for means of limiting subjectivity in the study of 
politics, rather than exulting in it. William H. Riker, observing
the progress of economics, as an empirical science, one hundred 
and twenty years after the birth of Alfred Marshall, suggests that 

¹ See Andrew Hacker, “Mathematics and Political Science,” in James C. Charles-
worth (ed.), Mathematics and the Social Sciences (Philadelphia: The American Acade-
my of Political and Social Science, 1963), pp. 58-76.

A thoughtful and piercing review of Hacker’s article has been written by Arthur 
S. Goldberg. See American Political Science Review, Vol. LVIII, 3 (September, 1964),
pp. 684-685. 
the example of economics is relevant. “[It] is somewhat premature
to forego the scientific enterprise [in studying politics].” 

The article by Hayward R. Alker, Jr. in this volume may well 
be an example of the kind of scholarship which enables a youthful 
scientific school to rise above the level of the fumbling and in-
choate. The article reveals its author’s skill in mathematics and
statistics, as well as in political science, and he faces squarely the
difficult questions which arise from both directions. Moreover, he
brings to the study a refreshing awareness of relevant literature in 
several sister social sciences. It can scarcely be charged that the 
paper addresses itself to the trivial. The centrality of the question 
of causation in empirical social science research is obvious. 

No doubt more will need to be said about the techniques of 
discovering and analyzing causal inference, but this article, in
analyzing hierarchical and reciprocal concepts of causation, has 
achieved a maturity of temper and a sureness in handling in- 
tricacies which deserve emulation. 



Content analysis is a technique for developing systematic infor- 
mation about a body of raw data— a newspaper, for example— in 
order to derive useful inferences about the values and perceptions 
of those who produced the raw data or those who were influenced 
by it. In a very loose sense, of course, historians have been en- 
gaged in content analysis since the first document was examined 
and the first inference was drawn. The term “content analysis,”
therefore, must be defined more precisely as a disciplined and
quantitative study of contextual frequencies and associations,
sometimes coded along attitudinal dimensions (such as “good-bad,”
“strong-weak,” etc.). As developed by Lasswell, Pool, Stone,
Merritt, and others, the systematized study and analysis of content
has become scientifically precise by contrast with the largely
intuitive and impressionistic procedures of the traditional historian. 

Yet the problems of inference remain severe. Quantitative data 
drawn from the editorials of the New York Times, the Frankfurter
Allgemeine Zeitung, or Pravda, for instance, are not self-evident
indications of the values and perceptions of the publishers, the 
editors, or editorial writers, nor of the readership. Nor can we 
assume automatically that our analysis tells us what values the 
writers aimed to communicate to the readership. Editors do not 
customarily derive their editorials from quantitative models. Even 
if they employed the same model as the analyzers, there is no
assurance that their interpretation of findings would be the same
as that of the analyzers. 

Despite problems of this nature, content analysis is a very 
necessary enterprise and one promising valuable returns. It is ex-
ceptionally important for informing policy-makers under modern
conditions of international politics. To be sure, it would appear 
dispensable, if Dr. George Gallup and the Michigan Survey Re- 
search Center were accorded free access to opinion leaders in the 
People’s Republic of China or the U.S.S.R., or if the psychoanalyst
who serves the editor of Pravda were in the pay of the C.I.A. This 
kind of information might appear most reliable, but we should 
want to check it through the use of other indicators, including 
those produced by content analytic techniques. 

Since our survey researchers and psychoanalysts do not operate 
freely behind the Iron Curtain, and since the measurement and 
analysis of attitudes, perceptions, and values are delicate problems 
under the best of circumstances, content analysis remains a vital 
key to understanding opinion leaders and publics—past, present, 
and future. Its considerable importance, therefore, makes it essential
that the enterprise be subjected to tough-minded scru-
tiny. This is the service performed by Richard Merritt in the
article which follows. Merritt appraises his own field critically, 
and he offers thoughtful suggestions to guide further study.



Causal Inference and
Political Analysis* 

HAYWARD R. ALKER, JR. 

Yale University

If we can define the causal relation, we can define influence, power, 
or authority, and vice versa. 

—Herbert Simon¹

If power is the ability more or less coercively to get people 
to do things that they otherwise would not do, exercising power 
is a special case of causation. Authoritative decision-makers legiti-
mately cause the well-being or deference accorded to some mem-
bers of a society to increase and the value positions of others to
diminish. Thus political analysis, as we usually define it, may be
thought of as the study of the processes and outcomes of authorita-
tive and coercive social causation.² The causal agents range from
individual citizens to national governments and international or-
ganizations; the arenas of their interaction include local communi-
ties, states, nations, and international societies.

Despite the centrality of causal inferences in political analysis, 
there has been a noticeable reluctance among political scientists 
explicitly to use causal language. Scholars would rather study 
“influence,” or “power,” or “decision-making,” or “functional rela-
tionships,” or “communication systems” than causal relationships
per se, even though each of these concepts implies some kind of 
causal dependence of policy outcomes on decision-makers placed 
in varying sociocultural and political contexts. 

A number of reasons may be offered for this reluctance. In 
academic discussions philosophical objections are frequently men- 
tioned. Hume was the first but by no means the last skeptical 

* In the preparation of this paper I was greatly aided by the thoughtful questions 
and computational assistance of Ronald Brunner. Hubert Blalock has offered a number 
of helpful suggestions. This research has been supported in part by the Yale Computer 
Center, The Yale Political Data Program, and Northwestern University’s project on the
Simulation of International Processes financed by JWGA/ARPA/NU (Advanced 
Research Projects Agency, SD 260). 

¹ See reference 45, p. 5, in the Bibliography of this paper.

² The necessity and propriety of distinguishing social causation from merely physical
or biological causation have been argued at some length by Sorokin and MacIver. (See
references 40 and 46). While accepting the reality of physical and biological deter-
minism, this concept implies the necessity of including human perspectives and activities
in our explanations of social, and in particular political, phenomena. 
philosopher to note that we observe repeated associations rather
than causal relationships. The meaning of causality to many such 
skeptics remains unclear. Operational methods for establishing 
causal relationships seem to be largely unknown. 

Going beyond the objections of positivistic philosophers, per- 
ceptive students of political behavior have variously emphasized 
that politics involves reciprocal relationships between representa- 
tives and their constituencies, anticipated reactions of the strong 
and the weak, functional exchanges of leadership and support, and 
even negative feedback from the forces of nature to our political 
helmsmen. These scholars do not so much object to the use of 
causal language as doubt that causal theories about the complex 
interrelationships of political life can be either explicitly stated or 
empirically tested in a satisfactory fashion. 

In addition to positivistic skeptics and doubting behavioralists, 
the critics of causal reasoning also include moralistic humanists. 
Doctrines of mechanistic causation and historical determinism are 
rejected as violating a fundamental belief in the freedom of the 
will. Even if the physical world is strictly determined, man’s 
nature requires him to be both free and morally if not causally 
responsible for his actions. 

Among the social sciences, political science has been especially 
sensitive to the complexities of human behavior and to the respon-
sibilities of moral choice. In rejecting doctrines of economic de-
terminism, class warfare, or psychological behaviorism, many po-
litical analysts have failed to learn of increasingly sophisticated and
less objectionable treatments of the causal inference problem that 
have recently been proposed by economists, sociologists, psy- 
chologists, and other social scientists. This paper will review some 
of these developments, paying particular attention to the above- 
mentioned problems of operationally defining and testing causal 
relationships, modelling and testing reciprocal interactions, and 
somehow accommodating the doctrines of determinism and free 
will. 


I. Definitions of the Causal Relation

Objections to the Humean "constant conjunction” definition of 
causality can usually be interpreted as implying this definition’s 
incompleteness rather than its incorrectness. After a brief review of 
some additions to and modifications of the Humean position, we 
shall present and discuss alternative mathematical treatments of
the causal relation particularly appropriate for the social sciences. 

A. Components of the Causal Concept.

Asymmetry. Perhaps the most fundamental implication of the
Humean viewpoint is the asymmetry of the causal relation. A
sergeant’s command causes a private’s response. The temporal
asymmetry in this causal relation is as clear as that between lightning
and thunder— the cause comes before the effect. Also implied is 
a unidirectional relationship; if somehow we can get sergeants to 
issue certain commands, their privates will obey, but not vice 
versa. The temporal and/or directional asymmetry of causation 
has been widely emphasized in the writings of philosophers, 
statisticians, and social scientists.³ It allows us to think of causal
chains (47), (28), causal paths (53), causal funnels (11), causal 
hierarchies (22, 45), and even, in the more metaphysical formula- 
tions, of first causes and unmoved movers. 

Political reality obviously includes a number of symmetrical, re- 
ciprocal influence relationships: for example, bargaining, the ex- 
change of leadership for support, and arms races. If these can be 
studied from a causal point of view, a major difficulty in applying
causal inference techniques to political phenomena will be over-
come. Fortunately, several procedures have been developed for
formalizing and testing causal models involving reciprocal relation- 
ships. A number of appropriate labels have been suggested: causal 
circles (52), reciprocal interaction (55), interdependent systems 
(29, 34), deviation amplifying feedback (41), and mutual causal 
processes (17). Since much of the rest of this paper will be de-
voted to illustrating some of these concepts, we only note here that
all of these modelling techniques use the asymmetry idea in model- 
ling reciprocal relationships, with or without assuming time lags 
between the initial and the feedback links. 

Contiguity. In addition to the asymmetry of the causality con-
cept, a good deal of the relevant philosophical and social science
literature stresses the contiguity of cause and effect. Physicists 
have long talked about the idea of “no action at a distance” (44,

³ For instance, see references 7, 12, 16, 21, 29, 33, 44. Simon’s notions of “unilateral
couplings” and “causal orderings” among variables (43, Part I) correspond very closely,
for example, to Deutsch’s methods for establishing hierarchies in communication sys-
tems: “For our studies of communication . . . we might be very interested in getting
operational tests for what is subordinate, what is coordinate, what is entirely separate.
The test would be which feedback is coupled to the other asymmetrically” (18).
Chapter 4). Social scientists, at least those in the Lewinian tradi-
tion, have stressed that in order for sociological variables to cause
behavior, they must enter into the psychological field of the indi- 
vidual (11, 46). Both natural and social scientists have rightly in- 
sisted that completely adequate explanations should specify the 
mechanisms linking causes to their effects (e.g., reference 28).

Although there are difficulties with the contiguity assumption- 
contact between cause and effect seems always to be instantaneous 
—this emphasis has engendered a number of enlightening theories 
about specific links or pathways between natural or social causes 
and their effects. In an important paper, Miller and Stokes, for
example, have compared the relative importance of congressmen’s
own beliefs about and perceptions of constituency opinions as in-
fluences on roll-call voting behavior (42). At several points below
we also shall discuss the relative importance of alternative causal 
paths and mechanisms linking causes such as constituency opinions 
to their effects in the political arena. 

Lawfulness. Herbert Feigl has suggested a “purified” definition 
of causality in terms of “predictability according to a set of laws” 
(21, p. 408). Even allowing for the uniqueness in some ways of 
every event, this definition makes explicit the need in causal in- 
ferences for comparable cases, multiple observations, and empirical 
generalizations. The existence of causal relations is in this sense 
nearly identical with the assumption— either metaphysical or 
methodological— of the uniformity of nature (see 32, Chapter 14 
and the discussion of J. S. Mill in 44, Chapter 4). This definition 
thus makes more explicit the “theory-laden” nature of simple causal 
statements (28); sergeants’ commands are obeyed because army
traditions of authority are strong and because privates usually find 
the cost of noncompliance to be too high, etc. A number of other 
possibly different influences on behavior are also assumed to apply 
whenever we make even a simple causal argument. 

This emphasis on empirical lawfulness helps to harmonize the 
mindlessness of the “constant or probable conjunction” view with 
more voluntaristic and humanistic outlooks.⁴ In a recent critique of
the Humean constant conjunction position, a British political phi- 
losopher, Alasdair MacIntyre, has argued: 

⁴ The Aristotelean view of causal laws can also be interpreted in both of these ways.
For a statistical political example see the discussion and references in ( 2 , pp. B'S). 
The distinguishing of causal explanations from teleological, predictive, functional, and
genetic ones will not be further undertaken here. (See, however, 32 and 44).
. . . to look for the antecedents of an action is not to search for an in-
variant causal connection, but to look for the available alternatives and
to ask why the agent actualized one rather than another. . . . The ex-
planation of a choice between alternatives is a matter of making clear
what the agent’s criterion was and why he made use of this criterion
rather than another and to explain why the use of this criterion appears 
rational to those who invoke it. (39, p. 61). 

To reconcile these two points of view we need first to stress that 
causal laws are not logically necessary or invariant but rather
empirically observed constant conjunctions, such as commands and
actions; secondly, we need to discover repeatedly invoked de-
cisional criteria explaining observed responses. Some, but not all,
causal arguments about human behavior do give such additional
explanations in terms of expectations of undesired punishment, etc.
Thirdly, we need to specify the historical context, constant or
changing, within which such generalizations are expected to hold.
Some modes of causal explanation only implicitly explain why 
certain criteria of choice are used. In regression-like causal models, 
for example, undetermined coefficients represent choices of a par- 
ticular criterion of action— each nonzero coefficient suggests the 
relevance of a particular variable, but the unspecified magnitude 
of the coefficient indicates a “degree of freedom” in the model.
Voluntaristic and teleological explanations which plague many 
physical scientists (see 21, 40), are both relevant and necessary to 
complete such explanations. 

Determinativeness. A fourth connotation of the causation con-
cept is frequently mentioned in the social science literature. Termi-
nology such as “independent” and “dependent” variables (33), one
variable “forcing” or “producing” changes in another (7), or the
“manipulative” or “operational” significance of a causal equation
(45) all strongly imply the determinativeness of the cause on the
effect. In perhaps the most sophisticated recent statement of this
point of view, Wold has suggested treating causal equations as
“autonomous behavioral relationships” for different groups of actors
in the economy (e.g., producers or consumers), giving the value
of the “response” variable “conditionally expected” on the basis of
known values of the “stimulus” variables and a ceteris paribus
assumption about variables not explicitly included in the equation
being discussed (52). This dependence of effect on cause is always
seen as more than merely a statistical or logical asymmetric rela-
tionship (6, 49, 51).⁵

Either out of modesty about attempting to mention all the causes 
involved or because of a basic belief that reality contains both 
deterministic and chance relationships, most physicists and social 
scientists no longer attempt to employ completely deterministic 
models. This liberation of causal thinking from an oversimplified
and strongly mechanistic assumption has probably benefited social 
scientists even more than physicists. Multivariate (i.e., many- 
variabled) and stochastic (i.e., probabilistic) theories (see 7, 14, 
34) have replaced many of the deterministic, single cause theories 
of the past (e.g., crude Marxism or Freudianism). As a result, in 
the mathematical formalizations of causal theory discussed below, 
the use of probabilistic or random terms will play an important 
part. So will the more modest goal of multiple causal explanations. 

Ceteris paribus. Both the determinativeness and lawfulness of 
causal relations require a tentativeness about causal inferences that 
is sometimes overlooked. As empirical generalizations they can 
always be proved wrong by a sufficient number of counter-
examples. Tentativeness concerning one’s conclusions is also re-
quired because at some point one must assume that possibly con-
founding variables have been adequately controlled for. All con-
crete statements of causal relationships more or less explicitly
make ceteris paribus assumptions.⁶ An important consequence of
this necessity is the need for caution about the extent to which 


⁵ Statisticians, for example, would distinguish sharply between a correlational and
a causal interpretation of statistical coefficients like Goodman and Kruskal’s tau or
the slope of a linear regression. Both coefficients are asymmetrically, but not usually
causally, interpreted. In addition, the causal viewpoint, unlike the merely “predictive”
one, requires special attention to errors in the independent variables. (See 3^, Chapter
29). Philosophers, on the other hand, are speaking both determinatively and asym-
metrically when they refer to causes as important sufficient conditions. (See 44, p. ^59 f.)

⁶ Besides the Wold usage illustrated above, definitions of the causal relation by
Lazarsfeld, Blalock, and Simon have all stressed the tentativeness and falsifiability of
causal claims in view of the ceteris paribus assumptions involved in concrete causal
inference procedures. Thus Lazarsfeld, referring to universal and partial covariances be-
tween two attributes X and Y within subcategories of a test factor C, suggests that
“if we have a relationship between X and Y, and if for any antecedent test factor
[C], the partial relationships between X and Y do not disappear, then the original re-
lationship should be called a causal one.” (36, p. 146) Similarly, after assuming that
“all other variables explicitly included in the causal model have been controlled or do
not vary,” Blalock says that “X is a direct cause of Y if and only if change in X
produces a change in the mean value of Y.” The list of other variables being controlled
is limited to those explicitly included in the model, but Blalock must also assume that
“the mean change in Y, for a given change in X is the same as the change that would
always occur if all outside influences could be rigidly controlled.” (7, p. 19)
model outcomes may be due to violations of these assumptions.
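
Lazarsfeld's test-factor criterion, quoted in the note above, can be illustrated with a small simulation. What follows is only a minimal sketch and not a procedure taken from this paper: the data are synthetic, NumPy is assumed, and the helper name `within_category_correlations` and all numerical settings are hypothetical choices made for illustration. An antecedent factor C drives both X and Y, so the overall X-Y relationship is sizable, yet it should largely disappear within each category of C, which under Lazarsfeld's rule would argue against calling the original relationship causal.

```python
import numpy as np

def within_category_correlations(x, y, c):
    """Correlation of x and y inside each category of the test factor c."""
    return {int(level): float(np.corrcoef(x[c == level], y[c == level])[0, 1])
            for level in np.unique(c)}

rng = np.random.default_rng(1)
n = 20_000
c = rng.integers(0, 2, size=n)          # antecedent test factor C (two categories)
x = c + rng.normal(scale=0.5, size=n)   # X depends on C only
y = c + rng.normal(scale=0.5, size=n)   # Y depends on C only, never on X

marginal = float(np.corrcoef(x, y)[0, 1])          # relationship ignoring C
partials = within_category_correlations(x, y, c)   # relationship holding C fixed

print("marginal correlation:", round(marginal, 2))
print("within-category correlations:", {k: round(v, 2) for k, v in partials.items()})
```

The sketch controls for only one test factor. The ceteris paribus caution in the text still applies, since some other, unexamined factor could make a surviving partial relationship vanish.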

Historically, Ronald Fisher’s work on randomization
has provided a number of systematic methods for experimentally
isolating particular explanatory relationships. More recent work by
sociologists and economists has focused on ways of being more
sure about or relaxing the ceteris paribus assumptions themselves
(5; 7, Chapter 5; 29; 52). Various of these techniques will be ex- 
amined in some detail below. 

B. Mathematizing the Causal Relation.

When dealing with highly general scientific concepts, it helps to 
state them in a precise, abstract fashion capable of both logical 
and (when given a particular interpretation) empirical investiga-
tion. A more immediate reason for using mathematical formaliza-
tions of the causal relation is the extent to which earlier formula-
tions, from Aristotle on down, can better be understood by doing
so. It then becomes clear, for example, that logical necessity
applies to the mathematical models rather than to what they
represent. Finally, the equation systems to be studied below make
parsimonious, yet often plausible and testable assumptions about
the asymmetry, contiguity, lawfulness, determinativeness, and
ceteris paribus aspects of causal statements.

Following the econometric tradition (especially Wold, 52) we
shall assume (1) that the causal laws relating members of a par-
ticular population can at least approximately be represented by
linear, additive equations;⁷ (2) that the philosophical belief in
probabilism or partial determinism is adequately expressed by in-
troducing “random” or “residual” terms in these equations, which
themselves may be considered as empirically based generalizations
about human behavior;⁸ (3) that ceteris paribus assumptions will
be stated as assumptions about these “random terms”; (4) that in
each equation it will be possible to distinguish “independent”

⁷ Threshold phenomena and multiplicative relationships are discussed in 1, 8,
14, 1^. Linear additive or multiplicative models with discrete time subscripts and ran-
dom terms are very similar to some of the differential equation models used by Rapo-
port, Coleman, and others. They differ, however, to the extent that they include (1)
only finite differences in variable values and change rates (rather than infinitesimal ones
as in the calculus); and (2) random or error terms whose effects can more realistically
change and cumulate from one time period to another.

⁸ Economists often include in their models equations not susceptible to behavioral
interpretations. The complicating implications of including definitional relations, etc.,
are discussed fully in (29, 34, 50).
from “dependent” variables,⁹ with particular coefficients indicating
the magnitude of each causal link involved; and (5) that temporal 
asymmetries may be indicated by t subscripts on variables for the 
different times at which they occur. 

Linear causal systems. With these assumptions, we can repre- 
sent any system of causally interrelated variables by a set of linear 
equations like Equations (1a). These equations are assumed to
apply to all N members of some specified population: 


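$$
\begin{aligned}
X_1 + a_{12} X_2 + \cdots + a_{1G} X_G + b_{11} Z_1 + b_{12} Z_2 + \cdots + b_{1H} Z_H &= U_1 \\
a_{21} X_1 + X_2 + \cdots + a_{2G} X_G + b_{21} Z_1 + b_{22} Z_2 + \cdots + b_{2H} Z_H &= U_2 \\
&\;\;\vdots \\
a_{G1} X_1 + a_{G2} X_2 + \cdots + X_G + b_{G1} Z_1 + b_{G2} Z_2 + \cdots + b_{GH} Z_H &= U_G
\end{aligned}
\tag{1a}
$$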

Notationally, there are G endogenous (mutually dependent) vari- 
ables, denoted by X’s, and H exogenous (independent or prede- 
termined) variables, denoted by Z’s. It would be possible to have 
various time subscripts on the Z’s; if desired the Z’s could even be 
lagged values of the X’s, as when $Z_1 = X_{1(t-1)}$, $Z_2 = X_{2(t-1)}$, etc. We
shall assume that there are G equations, each containing at least 
one endogenous variable as well as other variables causally influ- 
encing the endogenous ones, in particular a random term U and 
possibly other exogenous variables. For simplicity of exposition, we 
shall assume aU the X’s, Zs, and C7’s to have expected (mean) 
values of zero, and set the coefiflcient for one distinct endogenous 
variable in each equation equal to unity.^® 

Alternative simplifications. In causal systems like (1a) each endogenous
variable is determined by all other endogenous variables,
all exogenous variables, and a random or residual term. If
we are to assume the specific values of the a’s, b’s and U’s to be
unknown, we cannot estimate the magnitude of these terms without
further assumptions relating either to the U’s or to the a’s and
b’s. In designing models that are testable and whose parameters
can be uniquely estimated, a number of further simplifications
must therefore be made.

⁹In the econometric tradition, one may classify variables either in a single
equation or in a set of equations according to this kind of distinction. For a
system of equations, mutually dependent variables are called “endogenous” while
those assumed to cause, but not to be caused by, the endogenous variables are
known as “predetermined” or “exogenous” variables. It is sometimes useful to
think of exogenous variables as those that could be experimentally controlled,
along with the residual terms, and the endogenous variables as those whose
resulting variation we wish to examine.

¹⁰Using boldface letters to indicate matrices, we can succinctly represent the
causal system (1a) either by the matrix equation (1b):

\[ \mathbf{A}\mathbf{X} + \mathbf{B}\mathbf{Z} = \mathbf{U} \tag{1b} \]

or the even more compact form of equation (1c):

\[ \mathbf{C}\mathbf{Y} = \mathbf{U}. \tag{1c} \]

In these equations A and B are G × G and G × H coefficient matrices; X and U
are G × 1 column vectors and Z is an H × 1 column vector, but they could also
be written as G × N and H × N matrices. C is a G × (G + H) matrix composite of
A and B while Y similarly contains G + H variables ...
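The matrix form of system (1a), AX + BZ = U, can be made concrete with a small numerical sketch. Everything numeric here is hypothetical: the coefficient values, the sample size, and the normal draws are assumptions chosen only for illustration, with G = 2 endogenous and H = 1 exogenous variables. The sketch generates the X’s from known coefficients; it does not estimate anything, which is exactly the step that requires the further simplifications discussed in the text.

```python
import numpy as np

# Hypothetical coefficient matrices for G = 2 endogenous and H = 1 exogenous
# variables; each row of A carries a unit coefficient for its "own" variable.
A = np.array([[1.0, 0.0],
              [-0.5, 1.0]])      # G x G
B = np.array([[-0.8],
              [0.0]])            # G x H

rng = np.random.default_rng(0)
N = 1000                          # hypothetical population size
Z = rng.normal(size=(1, N))       # H x N exogenous values
U = rng.normal(size=(2, N))       # G x N residual terms

# A X + B Z = U  implies  X = A^{-1} (U - B Z) whenever A is invertible.
X = np.linalg.solve(A, U - B @ Z)

# Compact form C Y = U, with C = [A  B] and Y stacking X on top of Z.
C = np.hstack([A, B])             # G x (G + H)
Y = np.vstack([X, Z])             # (G + H) x N
```

Writing the data as G × N and H × N arrays, as here, is one way of letting a single matrix equation apply to all N members of the population.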

It turns out that choosing among alternative simplifications of
linear stochastic systems involves us in a number of the conceptual
controversies mentioned previously. The linear stochastic systems
approach not only calls attention to these issues (some controversial
decisions will have to be made before the model itself can
be tested and its coefficients identified); it also allows us to state
and argue the issues involved in a logical and empirical fashion.

These alternative choices of model-building assumptions can be 
arrayed along a number of dimensions. Specifically, for any multi- 
equation linear stochastic causal system, we must decide between: 

1) Hierarchical versus circular causation. The basic idea of
hierarchical causal relationships is that there exists a ranking of
endogenous variables defined only in terms of other endogenous
variables on which they are unilaterally dependent, and in terms
of exogenous variables. In such a system the highest rank would
be given to the first causes, the “unmoved movers” that depend
only on exogenous variables and residual terms. Lower ranked
endogenous variables are assumed to depend unilaterally only on
higher ranked endogenous variables, exogenous variables, and
residual terms. Because each variable is thus defined recursively
(i.e., in terms of previously definable, higher ranked variables),
such a set of equations is known as a recursive system. Because
only unilateral dependencies occur in each equation of a recursive
system (the dependent variable never “causes” the independent
one) it is possible in an unambiguous way to estimate the coefficients
of such an equation without taking other equations into
account (see 51, Chapter 2).

In fact, this simple decomposability of recursive systems into
the behavioral regularities of autonomous (but partly determined)
variables or actions is a major reason for the attractiveness of such
models: it allows us to think about autonomous actors, and simply
to describe their behavior. Another reason for the attractiveness
of such models is their testability. Making additional assumptions
about the residual terms, one can derive from competing recursive
models a number of empirical predictions on the basis of which
one model can be chosen over another. This procedure will be
illustrated in Section II below. Recursive, hierarchical models even
allow for feedback relationships among endogenous variables if
we assume such interactions take time. Then the endogenous
variable being fed back also enters into our equations in an earlier,
exogenous form.

If we feel reality or the approximate data that we can get from
reality includes instantaneous feedback relationships, then a
hierarchical ranking of endogenous variables is no longer possible. In
this case not all endogenous variables are unilaterally dependent
on logically prior variables. Estimating the coefficients of a single
equation in such a relationship without taking the additional
circularities into account gives us an incorrect picture of even the
unidirectional causal relationships that it contains (a mathematical
illustration of this point based on work by Haavelmo is given by
Valavanis in 50, Chapter 4).

2) Incomplete versus complete causal specification. We can 
build models with or without exogenous variables. Recall that 
circular causal models appear to some as more realistic and to 
others as a practical convenience because good time specific data 
is not available. Similarly, the use of exogenous variables is thought 
by some to be more realistic and by others to be an unnecessary 
complication. 

From a theoretical standpoint treating both endogenous and
exogenous variables as “independent” causes in a recursive system
seems an unnatural complication. Not identifying the causes of
the exogenous variables also smacks of incompleteness. There are
a number of reasons for including exogenous variables, however.
For nonrecursive models, they can give us sufficiently distinct
equations so that we can somehow grasp (“control” or “manipulate”
in a quasi-experimental sense) each equation separately and
uniquely identify its coefficients using multiequation estimation
techniques (50, Chapters 6, 9). For recursive models, they can
give us a larger number of predictions for testing the models (as
in 9 and 51). For either recursive or nonrecursive systems, by
forcing us to be somewhat more specific about the
other factors influencing the endogenous variables, they reduce
our dependence on possibly unrealistic assumptions about the
residual terms.

3) Uncorrelated versus correlated residual terms. If residual
terms are uncorrelated, the causal system is assumed to be isolated
to the extent that no outside variable affects more than one
endogenous variable. As implied above, including exogenous variables
can help make the “uncorrelated residuals” assumption more
realistic. Fortunately the uncorrelated residuals assumption is a
testable one, as the methods for choosing among causal models,
noted below, illustrate.

If we drop the uncorrelated errors assumption, we are in a
certain sense agreeing that causal systems cannot be as nearly
isolated from reality as the uncorrelated residuals assumption
implies. Imbedding our models to the extent that other variables
might be assumed to cause residual changes in several of the
endogenous variables is undoubtedly more realistic. Unfortunately,
however, not assuming uncorrelated residuals means that it is even
harder uniquely to identify the equations of causal models.¹¹

4) Static versus dynamic models. Whether or not we include
explicit temporal asymmetries in our causal models is another
major model building choice, capable of either philosophical or
empirical argument. Econometricians more than sociologists, for
example, have studied time-lagged relationships, probably because
meaningful time units (e.g., fiscal years) and data have been more
readily available. (Compare 29 and 34 with 7, 10, 14 and 42.) Two
other related issues are also involved: the problem of making
longitudinally valid (time series) predictions from cross-sectionally
gathered (simultaneous) data and the extent to which statistical
models assume or imply social systems to be in equilibrium.¹²

A basic problem is how to interpret the coefficients or their estimates
obtained from simultaneous models in which no time subscripts
explicitly occur. Coleman argues, for example (14), that
assumptions about uncorrelated residuals and unchanging model
coefficients are in effect equivalent to assumptions that processes
underlying simultaneous data are in equilibrium with each other.
He also has shown that coefficients from cross-sectional analyses
have a simple longitudinal interpretation only when variable
change rates are either negligible or constant (15). Perhaps the
best we can say is that simultaneous results are more nearly
causally interpretable in a longitudinal sense when re-equilibrating
tendencies are quick and variable change rates are slow.¹³

¹¹Some of Wold’s most interesting work in implicit causal systems (52) concerns
recursive models with correlated residual terms. Some such models give
predictions equivalent to certain non-hierarchical systems. In order to make
his models give determinate predictions, however, Wold has either to introduce
exogenous terms or fail to identify some of the coefficients he employs. See
also (55, 55).

¹²The cross-sectional versus longitudinal inference problem is discussed in a
number of places. See, for example, (2, Chapter 5 and the references) and (50,
Section 12.17). Coleman’s work on the equilibrium interpretation of causal
models is outstanding (14, Part II, and 15) and so is the wealth of material in
the econometric literature (52, Section 2, and references).
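The contrast drawn under (1) above, between recursive systems whose equations can be estimated one at a time and circular systems in which single-equation estimates mislead, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the Haavelmo illustration cited in the text: the coefficients 0.7 and 0.5, the unit-variance normal residuals, and the sample size are all hypothetical, and the equations are written in the “positive dependence” form rather than with all variables on the left-hand side.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200_000  # hypothetical, large enough that sampling noise is small


def ols_slope(x, y):
    """Least-squares slope of y on x for zero-mean variables (no intercept)."""
    return float(np.dot(x, y) / np.dot(x, x))


# Recursive (hierarchical) system: x1 = u1, x2 = 0.7 * x1 + u2.
u1 = rng.normal(size=N)
u2 = rng.normal(size=N)
x1 = u1
x2 = 0.7 * x1 + u2
recursive_estimate = ols_slope(x1, x2)      # close to the true 0.7

# Circular system with instantaneous feedback:
#   y1 = 0.5 * y2 + v1
#   y2 = 0.7 * y1 + v2
# Rewritten with all endogenous variables on the left and solved jointly.
v = rng.normal(size=(2, N))
feedback = np.array([[1.0, -0.5],
                     [-0.7, 1.0]])
y1, y2 = np.linalg.solve(feedback, v)
circular_estimate = ols_slope(y1, y2)       # not close to 0.7
```

With these assumed values the single-equation slope in the circular system converges to 1.2/1.25 = 0.96 rather than to the structural coefficient 0.7, because y1 is itself partly caused by the residual of the equation being estimated.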

Mixing simplifying assumptions. As implied above, different
simplifying assumptions tend to go together. At one “extreme”
there are those model builders (e.g., Blalock) who tend to use
hierarchical causal relationships, no exogenous variables,
uncorrelated error terms and temporal equilibrium; at the other “extreme”
are those (e.g., Koopmans) who tend to assume circular causal
relationships and correlated residual terms and use exogenous
variables and time-lagged equations. Obviously there are no
simple ordering principles that will explain why the four major
issues in model building have tended to produce these two
extremes.

A certain underlying set of attitudes about the decomposability 
or the interdependence of social reality seems, however, to exist. 
The hierarchical modellers act as if they believe reality to be more 
decomposable. This would mean that unilateral dependencies are 
adequate for describing causal relationships in which changes are 
slow or negligible and that causal systems can be satisfactorily 
isolated from their environments. If the uncorrelated residuals 
assumption is not too bothersome, then we need not introduce 
exogenous variables. 

The modellers of reciprocal, interdependent systems, on the
other hand, may see reality as less decomposable and more dynamic.
Simultaneous reciprocal dependencies would then seem
natural, as well as outside influences on several of the endogenous
variables. Exogenous variables, in this view, are necessary to help
identify the coefficients of the reciprocal relationships among the
dependent variables, as well as to reduce difficulties created by
variables affecting the residual terms.

Perhaps a better organizing principle would be that the “hierarchical”
modellers are at an earlier stage in theory-building. Blalock
might argue, for example, that current data and theories don’t
suggest or allow more realistic and more complex, yet falsifiable,
models. Certainly the statistics of recursive systems (least-squares
analysis as compared to maximum likelihood methods) is much
simpler than that of interdependent ones. This developmental
perspective would also explain the greater concern with testing
alternative theories to be found in the “hierarchical” literature and
the increased attention to parameter estimation techniques by the
modellers of interdependent systems.

¹³Econometricians usually make a number of additional assumptions that are
also controversial, but will not be discussed here: (5) that the residual U’s
have quasi-random, i.e. joint normal distributions, with zero means; (6) that
the U’s are assumed not to be autocorrelated (i.e., correlated with previous
values of themselves); (7) residual terms are also assumed to be uncorrelated
with the explicit endogenous variables. (See 50, pp. 77-79.) These assumptions
are most useful when trying to estimate population parameters from sample
statistics.

Fortunately, a number of modelling approaches fall in between
these two extremes. Blalock has recently discussed nonlinear
models and is interested in dynamic systems (S); Boudon has
recently dealt with parameter estimation problems in hierarchical,
static models with correlated residual terms (10); Wold has introduced
the idea of implicit causal models that are dynamic, allow
correlated residual terms, but maintain hierarchical relationships
among equations denoting behavioral regularities of autonomous
actors (52).

II. Hierarchical Causal Relationships 

We shall illustrate both the simplicity and the attractiveness of
the hierarchical modelling approach with several models drawn
from Daniel Lerner’s classic work on political development, The
Passing of Traditional Society (37). Each of the linear stochastic
models discussed will assume unilateral causal dependencies (recursive
relationships), uncorrelated residual terms, and no explicit
temporal asymmetries. Exogenous variables will not be employed
in our estimating procedures nor considered as elements in paths
linking causes to their effects.

Lerner states his thesis about the universal modernizing role of
literacy and media development in several places and in several
ways. At one point he refers to the chicken and egg problem of
reciprocal causal relations (37, p. 56) and forsakes speculative
causal inference problems for testable correlational ones. But then,
he argues in almost hierarchical causal fashion:

. . . the Western model of modernization exhibits certain components
and sequences whose relevance is global. Everywhere, for example,
increasing urbanization has tended to raise literacy; rising literacy has
tended to increase media exposure; increasing media exposure has “gone
with” wider economic participation (per capita income) and political
participation (voting). (37, p. 56).¹⁴

Only in the “gone with” link does he use language that is clearly
correlational and not causal.

At another point (37, pp. 58-65), Lerner makes inferences from
multiple correlations to three-phase historical and causal sequences.
Relations between stages, if not within them, also appear to be
hierarchical (recursive and unilateral).

Within [the] urban matrix develop both of the attributes which dis- 
tinguish the next two phases— literacy and media growth. There is a close 
reciprocal relationship between these, for the literate develop the media 
which in turn spread literacy. But, historically, literacy performs the 
key function. . . . The capacity to read . . . equips [the population] to 
perform the varied tasks required in the modernizing society. Not until
a third phase . . . does a society begin to produce newspapers, radio
networks, and motion pictures on a massive scale. This, in turn,
accelerates the spread of literacy. Out of this interaction develop those
institutions of participation (e.g., voting) which we find in all advanced
modern societies (37, p. 60).

Mathematizing the causal relations. Before presenting the causal
models derived from Lerner’s work, let us briefly mention the
operational procedures used in measuring the variables we shall
study. Urbanization will be measured by the percent of a nation’s
population living in cities of over 20,000 population size; literacy
will also be a percentage measure: percent of population over
age 15 that is literate as reported to UNESCO sources. Media
development will be measured by a combined index (whose
elements are highly intercorrelated) of media items: per capita
radios, newspapers, telephones, etc. The political participation
index is a weighted sum of voting turnout data and a measure of
political enculturation (two indices of participation which themselves
are rather distinctly intercorrelated).¹⁵

Let us now explicitly state several possible models implied by
the Lerner quotations. Since he himself is quite explicit about
reciprocal linkages, our hierarchical models are obviously
simplifications of his thinking, useful mainly for illustrating causal
modelling procedures. Because all our data is from around the year
1960, we in addition must assume rather than prove the relevance
of cross-sectional analysis procedures for longitudinal inference.

¹⁴D. Burnham has reminded me that American political development obviously
does not fit this scheme; its high male political participation came before
high media exposure and economic development. The Lerner model makes more
sense in European and Afro-Asian contexts.

¹⁵References are given in Figure 1. It is clear that the operationalizations
are at best tentative, especially since they are assumed to have interval
scale validity. Some of the most profitable work on causal modelling in
political science will obviously deal with variables that have lower levels of
measurement. For a variety of causal models applicable to nominal scale data,
see (14, Chapters 4-6).

The simplest interpretation of the above quotes is a three-fold
“stages of development” theory, in which urbanization brings
about literacy, literacy then increases media development, which
in turn increases political participation. By allowing only the
developmental sequence of causal links above, such an obviously
oversimplified approach sounds something like a civics course
(Emily Post version): come off the farm, learn to read, read the
newspapers, and then vote wisely. Mathematically, this theory may
be represented by equations (2a) and (2b):

\[
\begin{aligned}
X_1 &= U_1 \\
a_{21}X_1 + X_2 &= U_2 \\
a_{32}X_2 + X_3 &= U_3 \\
a_{43}X_3 + X_4 &= U_4
\end{aligned}
\tag{2a}
\]

\[
\sum U_1U_2 = \sum U_1U_3 = \sum U_1U_4 = \sum U_2U_3 = \sum U_2U_4 = \sum U_3U_4 = 0
\tag{2b}
\]

In these equations $X_1$ indicates urbanization, $X_2$ literacy, $X_3$ media
development, $X_4$ political participation; and $U_1$, $U_2$, $U_3$, $U_4$ are the
corresponding residual causes, which are assumed to be uncorrelated.
In system (2a), $X_1$ is the “dependent” variable in the first
equation, $X_2$ depends on $X_1$ and $U_2$ as indicated by the second
equation, etc. An arrow scheme representation of this “simple
stages theory” is given in Figure 1A.

Figure 1. Three Causal Theories of National Political
Participation Levels.* (N = 85)

A. Simple Stages Theory

[Arrow diagram: Urbanization ($X_1$) → Literacy ($X_2$) → Media
Development ($X_3$) → Political Participation ($X_4$)]

Deductions          Predictions                      Results
$b_{13.2} = 0$      $r_{13} = r_{12}r_{23}$          .41 vs. (.70)(.58) = .41
$b_{24.3} = 0$      $r_{24} = r_{23}r_{34}$          .66 vs. (.58)(.42) = .24
$b_{14.23} = 0$     $r_{14} = r_{12}r_{23}r_{34}$    .42 vs. (.70)(.41)(.42) = .12

B. Lerner Theory

[Arrow diagram: Urbanization ($X_1$) → Literacy ($X_2$); Urbanization ($X_1$) →
Media Development ($X_3$); Literacy ($X_2$) → Media Development ($X_3$);
Literacy ($X_2$) → Political Participation ($X_4$); Media Development ($X_3$) →
Political Participation ($X_4$)]

Deduction:   $b_{14.23} = 0$

Prediction:

\[
r_{14} = r_{12}r_{24} + \frac{(r_{34} - r_{24}r_{23})(r_{13} - r_{12}r_{23})}{1 - r_{23}^{2}}
\]

Results:     .42 vs. (.70)(.66) + [.42 − (.66)(.58)][.41 − (.70)(.58)] / [1 − (.58)(.58)] = .46

C. Revised Lerner Theory

[Arrow diagram: Urbanization ($X_1$) → Literacy ($X_2$); Literacy ($X_2$) →
Media Development ($X_3$); Literacy ($X_2$) → Political Participation ($X_4$);
Media Development ($X_3$) → Political Participation ($X_4$)]

Deductions          Predictions                                                                                  Results
$b_{13.2} = 0$      $r_{13} = r_{12}r_{23}$                                                                      .41 vs. .41
$b_{14.23} = 0$     $r_{14} = r_{12}r_{24} + (r_{34} - r_{24}r_{23})(r_{13} - r_{12}r_{23}) / (1 - r_{23}^{2})$    .42 vs. .46

* Data on urbanization and literacy are from Russett, Alker, Deutsch, Lasswell,
World Handbook of Political and Social Indicators (New Haven: Yale University Press,
1964). Media Development is a factor index based on per capita radio, newspapers,
telephones, and other data derived from the World Handbook and other sources.
Political participation data comes from Alker and Hopkins, reference 3. Basic ingredients
are World Handbook data on percentage voting turnout and Banks and Textor data
on political enculturation (The Cross-Polity Survey [Cambridge: M.I.T. Press, 1963]).

A much more realistic theory for explaining high levels of national
political participation is given in Figure 1B. As represented
there, and in Equations (3a) and (3b) below, this “Lerner theory”
suggests the urban situation ($X_1$) as a cause of both literacy ($X_2$)
and media development ($X_3$), with literacy rather than the media
playing the “key role” in the obviously reciprocal relationship
between media producers and consumers. Out of the interaction of
both of these develop higher levels of mass political participation
($X_4$). Note again how in each equation the “dependent” variable
on the left of the equals sign is distinguished by its coefficient of
unity.

\[
\begin{aligned}
X_1 &= U_1 \\
a_{21}X_1 + X_2 &= U_2 \\
a_{31}X_1 + a_{32}X_2 + X_3 &= U_3 \\
a_{42}X_2 + a_{43}X_3 + X_4 &= U_4
\end{aligned}
\tag{3a}
\]

\[
\sum U_iU_j = 0 \qquad (i, j = 1, \ldots, 4;\; i \neq j)
\tag{3b}
\]

As before, we are assuming in Equations (3b) that other things
are equal: that residual influences on $X_1$, $X_2$, $X_3$ and $X_4$ are
uncorrelated with each other.

It is interesting to compare causal systems (2a,b) and (3a,b)
with the general (indeterminate) model (1a). The new models
have been restricted in several ways. Besides the obvious omission
of exogenous variables and the “uncorrelated” residuals assumption,
a subtler difference concerns the pattern of linkage or dependence
coefficients, the a’s. A sufficiently large number of a’s
have been assumed to equal zero¹⁶ so that it is possible to arrange
all the remaining a coefficients into a triangular pattern (the X
and a subscripts were in fact chosen so that this would occur). In
a geometric sense, this kind of coefficient “pyramid” corresponds
exactly to the set of recursive relations that characterize hierarchical
causal systems. In Equations (2a) and (3a), $X_1$ is the first
cause, literacy the second cause, and media the third. Each variable
is caused only by random influences and other variables
higher in the causal hierarchy. Figures 1A and 1B also suggest
visually the same pyramid of unilateral causal dependencies.¹⁷

Choosing among causal models. The tentativeness of all scientific
arguments, in particular causal ones, means that at best we will
fail to reject one or several after submitting them to the test of
experience. To test the hierarchical theories in Figure 1, we need
to make deductions from their mathematical models, which (when
empirically reinterpreted) we may match with observed relationships.

¹⁶Other simplifying assumptions such as linear equations relating the a’s could
also have been made (see 34). These restrictions, however, would probably
violate the present assumption of hierarchical causal relationships.

¹⁷A third model, obtained by setting $a_{31}$ in Model (3a) equal to zero (Lerner
at one point omits this link), is given in Figure 1C. It will be discussed in
more detail in the next section of this paper.

At least two equivalent strategies for making deductions from
the models of systems (2a,b) and (3a,b) are possible (a third is
described in footnote 20). First of all, we can multiply together
pairs of equations in either model (e.g., 2a), average these products
over all members of the population being studied, divide
these products on both sides of the equals signs by appropriate
standard deviations, and then set the $\sum U_iU_j$ products equal to
zero because of assumptions (2b) and (3b). These equations can
then be solved for the values of the $a_{ij}$ in terms only of observable
variances and correlations; if enough $a_{ij}$ terms have been set to
zero (more than C(4,2) = 6 in the present examples), additional
predictions among observable correlation coefficients will also be
found. For more extended applications of this method the reader
is referred to (45, Chapter 2; 2, Chapter 6; and 10).
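As a worked illustration of this first strategy, consider the second and third equations of system (2a). The sketch assumes standardized variables (zero means, unit variances), so that an averaged cross-product of two X’s is simply their correlation; the step-by-step layout is ours and is not taken from the references cited.

\[
\begin{aligned}
\text{second equation} \times X_1:&\quad a_{21} + r_{12} = \tfrac{1}{N}\sum U_2X_1 = 0
  &&\Rightarrow\quad a_{21} = -r_{12}, \\
\text{third equation} \times X_2:&\quad a_{32} + r_{23} = \tfrac{1}{N}\sum U_3X_2 = 0
  &&\Rightarrow\quad a_{32} = -r_{23}, \\
\text{third equation} \times X_1:&\quad a_{32}\,r_{12} + r_{13} = \tfrac{1}{N}\sum U_3X_1 = 0
  &&\Rightarrow\quad r_{13} = r_{12}\,r_{23}.
\end{aligned}
\]

The right-hand sides vanish because $X_1 = U_1$ and $X_2 = U_2 - a_{21}U_1$, so each averaged product reduces to sums of $U_iU_j$ terms that (2b) sets to zero. The last line is a prediction among observable correlations alone, obtained because $a_{31}$ was set to zero; the negative signs on the a’s arise only because (2a) places every variable on the left of the equals sign.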

A second, simpler and more elegant approach is suggested in (7, 
Chapter 3 and 51, Chapter 2). Since the equations of hierarchical 
causal models are autonomous behavioral relations, by assumption
the “dependent” endogenous variable in each equation depends
solely on the independent variables in that same equation and
whatever variables these independent variables themselves depend 
on. Therefore, we can legitimately estimate by least squares or 
some other method the coefficients of any single equation in 
hierarchical models with uncorrelated random terms without taking 
the other equations explicitly into account. More precisely, the 
partial slopes corresponding to missing a coefficients within the 
coefficient pyramid should always be zero if we control for all 
other variables of higher causal order. Variables of higher causal 
order are those in higher rows of the coefficient pyramid. 

Turning to the models in Figure 1, this rule means that three
partial slopes in Theory A, one partial slope in Theory B, and two
partial slopes in Theory C should be zero. (These logical deductions
from the causal model are given on the left below each of
the arrow diagrams in Figure 1.) When partial slopes equal zero,
corresponding partial correlations are also zero; therefore we can
transform these deductions into the simplified “predictions”
concerning observable correlations given in the figure. (Formulas for
higher order partial slopes and correlations are obtainable from
many statistics books.) These predictions can be verified to be
exactly the same as those made by the “multiplying equation pairs”
method described above.
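The prediction formulas can be evaluated directly from the six correlations quoted in Figure 1. The following is a minimal sketch; the function and variable names are ours, and the correlations are the rounded values printed in the figure, so results agree with the figure only to about two decimals. Note that evaluating the third Theory A formula with $r_{23} = .58$ gives about .17 rather than the .12 printed in the figure, which appears to substitute .41 for that factor; on either figure the prediction is far from the observed .42.

```python
# Rounded correlations as printed in Figure 1 (N = 85).
r12, r13, r14 = 0.70, 0.41, 0.42
r23, r24, r34 = 0.58, 0.66, 0.42


def predicted_r14(r12, r13, r23, r24, r34):
    """r14 implied by a zero partial slope b14.23 (Theories B and C)."""
    return r12 * r24 + (r34 - r24 * r23) * (r13 - r12 * r23) / (1 - r23 ** 2)


# Theory A (simple stages): three zero partial slopes, three predictions.
theory_a = {
    "r13": r12 * r23,
    "r24": r23 * r34,
    "r14": r12 * r23 * r34,
}

# Theory B (Lerner): one prediction.
theory_b = {"r14": predicted_r14(r12, r13, r23, r24, r34)}

# Theory C (revised Lerner): two predictions.
theory_c = {
    "r13": r12 * r23,
    "r14": predicted_r14(r12, r13, r23, r24, r34),
}

observed = {"r13": r13, "r24": r24, "r14": r14}
for name, preds in [("A", theory_a), ("B", theory_b), ("C", theory_c)]:
    for key, value in preds.items():
        print(f"Theory {name}: {key} observed {observed[key]:.2f}, "
              f"predicted {value:.2f}")
```

Comparing each predicted value with its observed counterpart is the whole of the test; no coefficients need to be estimated at this stage.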

Looking at the results of our analysis, the comparisons between
the values of the expressions on either side of the prediction
equations, we see first that one prediction of the “simple stages theory”
is exactly correct, but that two others are way off. The Lerner
theory, on the other hand, has one result that is within 0.04 of
being correct, a difference between correlations that could easily
be attributed to sampling (or measurement) error.¹⁸ The “Lerner
Theory,” in which direct causal links between urbanization and
media development and between literacy and political participation
are added to the links of the simple stages theory, thus better
survives our limited empirical test.

A “revised Lerner Theory” as shown in Figure 1C, however, is
also plausible. This theory omits a direct link between urbanization
and media development; as such it is a compromise between
the “simple stages theory” and the original “Lerner Theory.” Like
the latter of these theories, it too resists falsification. Whether or
not the revised theory is preferred to the original theory depends
on such additional criteria as parsimony and realism.

Estimating causal links and causal paths. Considering the number
of assumptions and operational approximations involved in
attempting to test a longitudinal theory with cross-sectional data,
the successful survival of the Lerner Theory is gratifying enough
for us to proceed to estimate the magnitude of the links involved.¹⁹
Especially interesting, once the a’s are known, would be some idea
of the magnitude of the various pathways linking urbanization to
political participation.

For the original “Lerner Theory” (Figure 1B), we shall calculate
both dependence and path coefficients. By “dependence
coefficients” we mean standardized a coefficients in causally interpreted
linear stochastic models; “path coefficients” are products
of dependence coefficients along particular chains linking a causal
variable to one of its effects (see Wright’s extensive development
of these ideas in 53, 54, 55, and Boudon’s applications in 9, 10).²⁰

¹⁸Sampling and measurement error problems in causal modelling are as yet not
thoroughly explored. (See 7 and 33, however, for some interesting beginnings.)
For recursive models it seems reasonable to expect that significance tests of
the magnitude of supposedly zero partial slopes would be feasible if the
appropriate sampling, equal variance, and normality assumptions are satisfied.

¹⁹The “stages theory” is only one of several other “plausible” three, four, and
five arrow theories from which predictions were made that turned out to be
incorrect. Since several other less plausible models make the same predictions
as the (revised) theory, additional criteria for choosing among them are
required.

Dependence coefficients for the Lerner Theory were calculated
by separate least squares regression analysis of each equation in
system (3a). Table 1 gives these and related path coefficients. As
we might expect from the good predictions of the Revised Lerner
Theory, the urbanization → media link is weak; so is that between
media and participation. A more challenging (but debatable)
inference concerns the relative strength of two pathways from
urbanization to participation: urbanization → literacy → participation
and urbanization → literacy → media → participation.
Going causally from literacy to participation seems to characterize
modernization processes more than the more indirect route through
media development.

Since our data downgrade the causal role of the media in direct
contradiction to the emphasis in Lerner’s writings, a partial analysis
of other correlation data (using the same causal model) was
attempted to see whether or not measurement error in the media
index was responsible. The subsequent analysis indeed showed the
media → participation link to be stronger and the literacy →
participation one to be weaker.


Wright's basic formula in the case of no correlated error terms is: "Any
correlation between variables in a network of sequential relations can be
analyzed into contributions from all the paths (direct or through common
factors [causes]) by which the two variables are connected, such that the
value of each contribution is the product of the coefficients pertaining to
the elementary paths." (53, p. 163). Applying this rule to the arrow model of
Figure 1B gives six equations, assuming the a's to be for standardized
variables:

\[
\begin{aligned}
r_{12} &= a_{21} \\
r_{13} &= a_{31} + a_{21} a_{32} \\
r_{14} &= a_{21} a_{42} + a_{21} a_{32} a_{43} + a_{31} a_{43} \\
r_{23} &= a_{32} + a_{21} a_{31} \\
r_{24} &= a_{42} + a_{32} a_{43} + a_{21} a_{43} a_{31} \\
r_{34} &= a_{43} + a_{32} a_{42} + a_{31} a_{21} a_{42}
\end{aligned}
\]

These equations can be manipulated to give the same prediction
($r_{14.23} = 0$) given by the first and second derivation methods already
discussed. They are in fact the same equations derivable by the "multiplying
equations" method when the variables and coefficients involved are
standardized, but this time obtained by inspection of the arrow diagram!
Wright also gives a rule for writing down prediction and estimation equations
when ultimate factors (those depending only on random terms) are correlated
(53, p. 163 f). Both of these rules are stated on the assumption that
dependent variables are on the opposite side of an equals sign from
independent ones (this is not true for 3a). Therefore, in comparing results
among different derivation procedures, all a coefficients in 3c should each
individually be preceded by a minus sign.
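Wright's rule and the vanishing partial correlation it implies can be checked numerically. The following is a minimal sketch, not the author's procedure: it assumes the five-link arrow model described in the note (no direct link from X1 to X4), uses the positive-convention coefficient values of Table 1 purely as illustrative inputs, and the function names are our own.

```python
import math

# Illustrative path coefficients (positive-dependence convention, as in Table 1):
# X1 -> X2, X1 -> X3, X2 -> X3, X2 -> X4, X3 -> X4; no direct X1 -> X4 link.
a21, a31, a32, a42, a43 = 0.70, 0.063, 0.57, 0.64, 0.051


def implied_correlations(a21, a31, a32, a42, a43):
    """Correlations implied by Wright's rule for standardized variables."""
    return {
        "12": a21,
        "13": a31 + a21 * a32,
        "14": a21 * a42 + a21 * a32 * a43 + a31 * a43,
        "23": a32 + a21 * a31,
        "24": a42 + a32 * a43 + a21 * a31 * a43,
        "34": a43 + a32 * a42 + a31 * a21 * a42,
    }


def partial(r_ij, r_ik, r_jk):
    """Partial correlation of i and j with one further variable k held constant."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


r = implied_correlations(a21, a31, a32, a42, a43)

# Second-order partial r_14.23, built from first-order partials that hold X2 constant.
r14_2 = partial(r["14"], r["12"], r["24"])
r13_2 = partial(r["13"], r["12"], r["23"])
r34_2 = partial(r["34"], r["23"], r["24"])
r14_23 = partial(r14_2, r13_2, r34_2)

print("implied r14:", round(r["14"], 4))
print("implied r14.23:", r14_23)
```

Whatever admissible coefficient values are substituted, the last quantity should come out at zero up to floating-point error, because the model contains no direct X1 -> X4 arrow.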
Table 1. Least-Squares Estimates of Standardized Dependence
and Path Coefficients for the Lerner Theory of Modernization*
(N=85)

Dependent Variable     Independent Variable    Dependence Coefficient (a_ij)

Literacy (X2)          Urbanization (X1)       a21 = 0.70
Media (X3)             Urbanization (X1)       a31 = 0.063
Media (X3)             Literacy (X2)           a32 = 0.57
Participation (X4)     Literacy (X2)           a42 = 0.64
Participation (X4)     Media (X3)              a43 = 0.051

Dependent Variable     Causal Path             Path Coefficient

Participation (X4)     X1 -> X2 -> X4          p124  = a21 a42     = 0.45
Participation (X4)     X1 -> X2 -> X3 -> X4    p1234 = a21 a32 a43 = 0.02
Participation (X4)     X1 -> X3 -> X4          p134  = a31 a43     = 0.003

* Estimates are derived by the least squares method applied to the model (3a,b)
and data of Figure 1B. They are equivalent to estimates derivable from the path
coefficient approach and Equations (3c) using the observed correlational values
r12 = .70, r13 = .41, r14 = .42, r23 = .38, r24 = .67, r34 = .42. The path
coefficients in the table add to 0.47, indicating an error of 0.05 from the true
value of their sum (r14 = .42) if the model were exactly correct. It should also
be noted that, unlike the convention of equations 1a, 2a, and 3a, positive
dependence coefficients refer to positive dependence relationships.
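The arithmetic behind the path coefficients and their sum can be retraced in a few lines. This sketch only multiplies the dependence coefficients reported in Table 1 along each path; the variable names are ours.

```python
# Dependence coefficients as reported in Table 1.
a21, a31, a32, a42, a43 = 0.70, 0.063, 0.57, 0.64, 0.051

# Each path coefficient is the product of the coefficients along the path.
paths = {
    "X1 -> X2 -> X4": a21 * a42,
    "X1 -> X2 -> X3 -> X4": a21 * a32 * a43,
    "X1 -> X3 -> X4": a31 * a43,
}

total = sum(paths.values())
observed_r14 = 0.42  # observed urbanization-participation correlation
discrepancy = total - observed_r14

for name, value in paths.items():
    print(f"{name}: {value:.3f}")
print(f"sum of paths: {total:.2f}, discrepancy from r14: {discrepancy:.2f}")
```

The route through literacy alone carries nearly all of the total, which is the sense in which the media link is downgraded.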

III. Reciprocal Causal Relationships 

In testing the model and estimating the parameters in the
Lerner Theory, we had to remove any direct reciprocal (<->)
links between two variables in order to get determinative results.
Such links are a special case of non-hierarchical circular or
feedback influence relationships. To illustrate a theory-building
procedure not restricted to the assumptions of uncorrelated random
terms and hierarchical relationships, but including exogenous
variables as influences on the endogenous ones, we shall consider
another set of propositions regarding national systems.

Our method of theorizing will, however, be different from the
Lerner example in that interrelated propositions will be derived
from the interaction between traditional conceptions of political
alternatives and more empirically based data analysis. It is hoped
that this one small example will illustrate how many of the insights
of the rich qualitative tradition can be partially translated into
more mundane, but more precise and testable, theories about
crucial reciprocal political relationships.

We shall still have to assume that our findings indicate something like longitudinal
(historical) causal relationships. All our equations may still be interpreted as
probabilistic behavioral generalizations (or laws), but they will no longer be assumed
to indicate autonomous relationships. Because of the high level of aggregation in dealing
with national systems, it will also be even more difficult than it was in the previous
examples to identify specific actors, or within-nation mechanisms responsible for each
of the causal links involved. Nonetheless we shall sometimes refer to people in either
illustration in such terms as "media consumers and producers," "voters," etc., when
these terms seem appropriate.

In particular, we shall attempt to state, test, and estimate
parameters for several theories about reciprocal relationships between
communism and democracy in economically developed Western
societies. Other mutually dependent variables will include levels
of executive stability, political participation, and domestic group
violence; each will be assumed exogenously and partially to depend
on levels of urbanization, literacy, and economic development and
other residual factors.

Sources for the indicators to be used, and correlations among
them, are given in Table 2. Endogenous, political indicators
include: 1) Communist voting as a percentage of national totals
(Yale Political Data Program, etc.); 2) polyarchy (approximately
as coded by Arthur Banks on the basis of the existence of a
legitimate opposition, free press, elections, etc.); 3) domestic
group violence (logged deaths as a fraction of population size,
according to Rudolph Rummel); 4) political participation (as
described in the notes to Figure 1, primarily an index based on
voting turnout); and 5) average executive stability or tenure
(approximately as coded by the Yale Political Data Program).
Literacy rates, urbanization and per capita Gross National Product
are the exogenous variables. Symbolic labels for the endogenous
(X1, ..., X5) and the exogenous variables (Z1, Z2, Z3) are given in
the margins of the table.


The possibility of applying such models to "exchange" and "feedback" relationships
such as consumer-producer exchanges has already been mentioned (see 77, 18, 19,
45, 52). Similarly, Lerner presents a "circular" arrow model for the relationships
among interest articulation, interest aggregation, and public communication, etc. (38,
p. 348 ff) based on work of Almond and Coleman; he also discusses changes in the
"vicious circle" of poverty necessary to bring about a "growth cycle" (38, p. 346 ff).
Maruyama has reinterpreted Myrdal's work on the growing gap between rich and poor
nations (e.g., 43), in terms of deviation amplifying reciprocal causation (41). Literary
commentators on politics also often stress cyclical relations: Sartre, for example, has
emphasized how the oppression of the colons and the hatred of the colonial reinforce
each other (see references in 20).

Several of these clearly do not apply to Soviet Bloc countries or non-communist
underdeveloped ones. Thus these theories will be more modest in their generality than
those which Lerner claimed were valid "regardless of variations in race, color, creed"
(37, p. 46). That some political (perhaps more than socioeconomic) "laws" differ in
different contexts has been amply demonstrated. Ways of formulating (usually
non-additive) causal theories that encompass such varieties have been discussed in
(1, 14, 15, and the references cited therein).
Table 2: Correlations Among 3 Socioeconomic and 5 Political
Characteristics of 36 Economically Advanced Non-Communist Nations*

                                  X1     X2     X3     X4     X5     Z1     Z2     Z3

Communist Vote (X1)             1.00   .217  -.087   .170  -.251   .179  -.172   .008
Polyarchy (X2)                         1.00   .639  -.608   .193   .487   .215   .594
Political Particip'n. (X3)                    1.00  -.535   .526   .769   .389   .641
Domestic Group Violence (X4)                         1.00  -.392  -.463  -.230  -.440
Executive Stability (X5)                                    1.00   .469   .402   .507
Literacy (Z1)                                                      1.00    ...   .670
Urbanization (Z2)                                                         1.00   .468
Per Capita GNP (Z3)                                                              1.00

* Nations all have per capita Gross National Products above $250.00. Socioeconomic
indicators are taken from B. Russett, H. Alker, K. Deutsch, H. Lasswell, World
Handbook of Political and Social Indicators (Yale University Press, New Haven; 1964);
political ones are factor indices derived in reference 3 and discussed in the text of the
present paper.

Mathematizing reciprocal causal relationships. A circular arrow
diagram and an equivalent non-pyramidal system of reciprocal
linear stochastic causal relationships are presented in Figure 2.
The heart of this theory is the assumption that communist voting
(X1) and polyarchy (X2) are antithetical to each other. While
domestic group violence (X4) breeds communist voters (because of
the frustration it represents or causes), highly polyarchic systems
are assumed to encourage legitimate participation (X3) and to
discourage or obviate domestic group violence. Legitimate
participation (e.g., voting) is thought to reduce the need for violence;
communist voting is modelled as decreasing the chances of democratic
government (polyarchy).

In a sense, these interrelated propositions follow from the "best"
arguments of traditional apologists for both democracy and
communism. Communism is supposed to follow domestic revolution
and violence and, in the long run, to bury democracy. Polyarchy,
on the other hand, is said by its modern advocates to increase
popular participation in government and to decrease the
likelihood of domestic violence.

Models of this complexity do not spring full blown from correlation matrices like
Table 1, even though they may be influenced by them. At first I tried hierarchically
to take the "best case" arguments and put them together, with communist voting
producing executive stability and high participation, and polyarchy decreasing violence
and increasing participation. This didn't work, even when a link equivalent to the
"communism will bury polyarchy" argument was added. Turning to reciprocal models,
I was unsuccessful in several attempts meaningfully to include executive stability and
still get a set of reasonable parameter estimates.

Figure 2: A Reciprocal Causal Theory of Equilibrium Relationships
Between Communist Voting and Democratic Government in
Economically Advanced Non-Communist Nations.

A. Arrow Diagram*

[Arrow diagram]

B. Linear Model

Endogenous political links are written first, exogenous socioeconomic links second:

\[
\begin{aligned}
X_1 + a_{14} X_4 &+ b_{12} Z_2 + b_{13} Z_3 = U_1 \\
a_{21} X_1 + X_2 &+ b_{21} Z_1 + b_{23} Z_3 = U_2 \\
a_{32} X_2 + X_3 &+ b_{31} Z_1 + b_{32} Z_2 = U_3 \\
a_{42} X_2 + a_{43} X_3 + X_4 &+ b_{43} Z_3 = U_4
\end{aligned}
\tag{4a}
\]

* Continuous signed arrows refer to predicted political relationships; signed dashed
arrows refer to predicted socioeconomic links. Circled signs above the arrows indicate
that the theoretical sign prediction was different from that of the estimate.

Taken together these two points of view imply two potentially
unstable circular relationships between polyarchy and communist
voting, one via domestic violence, the other via participation and
resulting changes in violence and communist voting. That an
unstable equilibrium is implied by these interrelated propositions is
shown by assuming a random increase (or decrease) in communist
voting. Following the arrow diagram, we see that chances
for polyarchy would decrease and, by both circular paths, violence
would increase, bringing more communist voters to the polls. This is
truly a vicious cycle, because these voters would further decrease
the extent of the nation's democracy. An initial spurt for polyarchy,
on the other hand, would probably end in a world of democracies
if these propositions were correct!

Table 3: Estimated Dependence Coefficients for Original
and Revised Equilibrium Theories of Communist
Voting -- Democratic Government Relationships*

Variables                     Original Theory           Revised Theory
                              (Figure 2.A)              (Figure 3.A)
Effect        Cause           Estimate†      Theory     Estimate†      Theory

comm. vote    violence        a14 = 1.30     X          a14 = -.49     ✓
polyarchy     comm. vote      a21 = -.40     X          a21 = -.40     ✓
particip'n.   polyarchy       a32 = -.45     ✓          a32 = -.20     ✓
violence      polyarchy       a42 = .17      ✓          a42 = .24      ✓
violence      particip'n.     a43 = .46      ✓          a43 = .39      ✓
comm. vote    literacy                                  b11 = -.57     ✓
comm. vote    urbaniz'n.      b12 = .27      X          b12 = .34      ✓
comm. vote    income          b13 = .44      ✓
polyarchy     literacy        b21 = -.04     ✓          b21 = -.04     ✓
polyarchy     income          b23 = -.57     ✓          b23 = -.57     ✓
particip'n.   literacy        b31 = -.53     ✓          b31 = -.58     ✓
particip'n.   urbaniz'n.      b32 = -.02     ✓
particip'n.   income                                    b33 = -.13     ✓
violence      income          b43 = .05      ✓          b43 = .04      ✓

* Estimates were obtained by least-squares regression methods from the reduced
forms of Equations (4a: Figure 2.A) and (5a: Figure 3.A). The standardized regression
coefficients in each case were the same. For dependent variables 1, 2, 3, and 4 (in that
order) and exogenous variables 1, 2, 3 (in that order) they were: .42, -.32, -.12;
.20, -.13, .52; .62, -.026, .24; -.32, .04, -.25. A mimeographed sheet describing
the algebraic derivations involved is available from the author on request.

† If the relevant link in the arrow diagrams of Figure 2 and Figure 3 has a plus
sign, this means the related a or b coefficient should be minus, and vice versa, because
all the X's and Z's are on the same side of the equals sign.

Exogenous to these essentially political dependencies in Figure
2 are a number of socioeconomic links with urbanism, literacy, and
per capita GNP. Since this theory does not attempt to explain the
causes of these variables themselves, we were able to draw on the
earlier hierarchical modelling experience of this paper in choosing
specific exogenous relationships. Use was also made of the simple
correlations in Table 1. Specifically, literacy was assumed to
increase both polyarchy and popular political participation,
urbanization (a la Lenin) to increase communist voting and political
participation, and high per capita income to increase the chances
of democracy, while decreasing the appeal of both violence and
communist voting. This last trio of democratic "attributes" follows
rather naturally, of course, from the work of Lipset and others on
socioeconomic conditions for democracy.

After mathematically specifying a complex model of political
and socioeconomic interrelationships, there remained the problem
of assuring oneself of the identifiability of the parameters implied
by the model. If from the empirical distributions of both exogenous
and endogenous variables, a coefficient estimating procedure
will always lead to a unique set of estimates, a model's equations
are exactly identifiable. The alternative situations are that a
model is overidentified (several values of the a's are predicted,
or additional relations among correlations can be derived) or
underidentified (in which case the model is indeterminate and
we do not have enough information to obtain less than an
infinitude of coefficient estimates). Fortunately, the variety of
exogenous links used in the equations of the reciprocal causal models in
Figures 2 and 3 is sufficient for each of these equations to be
identified.

Identifiability is thus a theoretical property that may hold
independently of the data used to estimate, validate, or falsify causal
models. Stating or revising one's causal models so that each
equation will be identifiable is obviously an important theory building
problem. Basically, for any equation, other equations in the model
have to be different enough from it so as not to be confused with it.

This problem pertains to both hierarchical models (in which no uncorrelated
residuals assumptions are made) and reciprocal ones. In general recursive models are
identifiable only if uncorrelated residuals are assumed (see 10). If we had added a
direct causal arrow between urbanism and participation in the first version of the
Lerner Theory (Figure 1B), no excess predictions could then have been derived from
the model, although equations (3c) would still have given determinate results for the
a's. Adding one more causal link (and a coefficient) would have simultaneously made
the Lerner Theory non-recursive and non-identifiable. Six equations in seven unknowns
(the old a's plus two new ones) would have allowed an infinite number of solutions
for the values of these. A good introduction to the identifiability problem may be
obtained by reading (Hood and Koopmans, 29, Chapter 2) and then (Valavanis, 50,
Chapter 6).

Actually, we tested one model not presented here, with fewer b coefficients than
those in Figures 2 or 3, that was "overidentified." It proved unsatisfactory.

Koopmans has stated that: "A necessary condition for the identifiability of a
[behavioral] equation within a given linear model is that the number of variables
excluded from the equation (more generally, the number of linear restrictions on the
parameters of that equation) be at least equal to the number of [behavioral] equations,
less one. ... A necessary and sufficient condition for the identifiability of a [behavioral]
equation within a linear model, restricted only by the exclusion of certain variables
from certain equations, is that we can form at least one nonvanishing determinant of
order G-1 out of these coefficients, properly arranged, with which the variables excluded
from that . . . equation appear in the G-1 other . . . equations." (29, p. 38) Tintner
(48, Chapter 7) gives a simple illustration of the application of these conditions.
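Koopmans' necessary (order) condition lends itself to a mechanical check. The sketch below is an illustration under stated assumptions rather than a full identifiability test: the inclusion pattern is our reading of Equations (4a), the function name `order_condition` is hypothetical, and the sufficient (rank) condition is not examined.

```python
# Variables of the four-equation reciprocal model.
VARIABLES = ["X1", "X2", "X3", "X4", "Z1", "Z2", "Z3"]

# Which variables appear in each equation (our reading of Equations 4a).
MODEL_4A = {
    "eq1": {"X1", "X4", "Z2", "Z3"},
    "eq2": {"X1", "X2", "Z1", "Z3"},
    "eq3": {"X2", "X3", "Z1", "Z2"},
    "eq4": {"X2", "X3", "X4", "Z3"},
}


def order_condition(model, variables):
    """Compare the number of excluded variables in each equation with G - 1."""
    g = len(model)  # number of behavioral equations
    report = {}
    for name, included in model.items():
        excluded = len(set(variables) - included)
        if excluded < g - 1:
            status = "fails"       # cannot be identified
        elif excluded == g - 1:
            status = "just meets"  # candidate for exact identification
        else:
            status = "exceeds"     # candidate for overidentification
        report[name] = (excluded, status)
    return report


report = order_condition(MODEL_4A, VARIABLES)
for name, (excluded, status) in report.items():
    print(name, "excludes", excluded, "variables:", status, "the order condition")
```

Each equation excludes exactly three of the seven variables, one fewer than the number of equations, which is consistent with the claim that the model is exactly identified provided the rank condition also holds.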

Estimating and testing reciprocal models. Because reciprocal
models in general cannot be used to generate predicted
relationships among observable correlation coefficients (49, Chapter 6),
some other method is necessary for choosing among reciprocal
causal models. The approach to be used here suggests eliminating
or failing to eliminate reciprocal causal models primarily on the
basis of correct or correctable theoretical predictions of the signs
of estimated coefficients. Notice how in one sense we are
pragmatically using a theory in order to test it; this approach has little
payoff, of course, unless some a priori degree of belief can be
generated concerning the direction of particular causal relationships.

Estimates of the a's and b's in Figure 2 could be obtained by
a number of econometric methods, including maximum likelihood
analysis. The one used here is generally referred to as a
"sophisticated least-squares procedure." It assumes that the equations of
a causal model, e.g. Figure 2, implicitly rather than explicitly
indicate behavioral relationships. To get at the implicit coefficient
values requires taking the simultaneous circular causal effects of
dependent variables on themselves into account. Therefore, all
endogenous variables in a set of causal equations have
simultaneously to be solved for in terms of the exogenous variables
from which unbiased coefficient estimates may be obtained by
least-squares procedures. Going back from "reduced form"
least-squares estimates obtained from equations relating each X to only
exogenous variables and random terms is, in fact, one of the prime
reasons why the identifiability question is raised: can we derive
unique values of the a's and b's from least-squares estimates based
only on the reduced form equations? For identifiable equations,
the answer is yes.
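The link between structural and reduced-form coefficients can be illustrated by running the calculation in the opposite direction from the one used in the paper: starting from the Revised Theory estimates of Table 3, solve for the X's in terms of the Z's and compare with the reduced-form regression coefficients quoted in the note to that table. This is a sketch under assumptions. The matrices A and B are our transcription of the revised model's coefficients, the solver is a generic Gauss-Jordan routine rather than anything used by the author, and the reported values are taken as printed (with 0.52 as our reading of the polyarchy-income entry).

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [v - f * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


# Structural form A X + B Z = U, rows = equations, columns = X1..X4 and Z1..Z3.
A = [
    [1.0, 0.0, 0.0, -0.49],   # X1 + a14 X4
    [-0.40, 1.0, 0.0, 0.0],   # a21 X1 + X2
    [0.0, -0.20, 1.0, 0.0],   # a32 X2 + X3
    [0.0, 0.24, 0.39, 1.0],   # a42 X2 + a43 X3 + X4
]
B = [
    [-0.57, 0.34, 0.0],       # b11 Z1 + b12 Z2
    [-0.04, 0.0, -0.57],      # b21 Z1 + b23 Z3
    [-0.58, 0.0, -0.13],      # b31 Z1 + b33 Z3
    [0.0, 0.0, 0.04],         # b43 Z3
]

# Reduced form X = PI Z + (A inverse) U, where A PI = -B.
neg_b = [[-v for v in row] for row in B]
PI = solve(A, neg_b)

# Reduced-form coefficients as quoted in the note to Table 3.
REPORTED = [
    [0.42, -0.32, -0.12],
    [0.20, -0.13, 0.52],
    [0.62, -0.026, 0.24],
    [-0.32, 0.04, -0.25],
]

max_gap = max(abs(p - q) for rp, rq in zip(PI, REPORTED) for p, q in zip(rp, rq))
for row in PI:
    print([round(v, 3) for v in row])
print("largest gap from reported reduced form:", round(max_gap, 3))
```

Because the published coefficients are rounded to two decimals, the recovered reduced form agrees with the reported one only approximately; the gaps stay within a few hundredths.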

Turning now to Table 3 and Figure 2, we see that three
coefficients (a14, a21, b12) have had their signs incorrectly predicted.
Looking at Figure 2, and changing in our mind the circled
(incorrect) arrow signs, we see that our model signs now show a
circular reinforcement of polyarchy by communist voting, via less
domestic violence and more communist votes or via greater
participation, less violence and again more communist votes.
Unstable equilibrium is implied.

The standardized coefficients were estimated by least squares analysis of the
reduced form of the causal model of Equations (4a). This method allows for the
interdependence of the X's, solving for them in terms of exogenous variables
and residual terms. This "sophisticated least squares procedure" gets around the
principal objections raised by maximum likelihood advocates. (See 50, 51 regarding this
and other estimating procedures.) The present results were obtained by tedious algebraic
derivations of the reduced forms of Equations (4a) and (5a), then computerized least
squares analyses, and finally the calculation of the a's and b's in the unreduced equations
using these results. A more elegant approach to getting the X's in reduced form is
given by Equation (6), obtainable from Equation (1b) above:

\[
X = -A^{-1} B Z + A^{-1} U \tag{6}
\]

A careful reanalysis was necessary to see why the sophisticated
least squares procedure produced estimates violating Figure 2's
theoretical predictions. Trying, if possible, to keep the same
number of missing exogenous variables in each of the model's
equations (in order to retain the uniquely identifiable characteristic of
the model), it was first decided to assume a positive impact of
literacy on communist voting, in line with a similar effect already
assumed for total voting levels in Figure 1. From the reduced form
regression analysis, income seemed a less promising agent of
decreased communist voting (decreasing concentration of wealth
would have been better), so it was removed from this link and
joined to public political participation. A weak link (urbanization
-> participation) was also dropped.

A close look at some particular cases (residuals analysis) helped
suggest why two of the three wrong predictions had been incorrect
(perhaps the third was due to improper specification of the
exogenous relationships). First of all, b12 is positive because the
highest communist voting levels have occurred in countries like
Finland, Italy, France, and Chile, none of which is terribly urbanized.
In Western Europe, communism appears to be more
characteristically rural than urban, as at least one leading Chinese
theoretician would like us to believe. Similarly, a21 is positive because
these countries have average or above average polyarchy scores.
Communist voting in Western Europe indeed occurs most
significantly in countries tolerating radical opposition. The existence
of circa 15-20% communist voting levels helps to maintain (or is
"functional for") traditions and institutions tolerating dissent.
Perhaps communist voters can legitimately relieve or outgrow the
frustrations in this relatively harmless way. Apparently,
communist voting at these levels increases rather than diminishes system
democracy.

Testing of a revised “peaceful co-existence” reciprocal causal 
model (Figure 3) devised in light of this reanalysis gives more 
plausible size estimates for the a and b coefficients, as well as cor- 
rect signs for each of the twelve propositions summarized in Table 
3 under the Revised Theory heading. 
Figure 3. A Revised Theory of Reciprocal Equilibrium Relationships
Between Communist Voting and Democratic Government in
Economically Advanced Non-Communist Nations.

A. Arrow Diagram*

[Arrow diagram linking urbanization, literacy, communist vote, polyarchy,
political participation, and domestic group violence]

B. Linear Model

Endogenous political links are written first, exogenous socioeconomic links second:

\[
\begin{aligned}
X_1 + a_{14} X_4 &+ b_{11} Z_1 + b_{12} Z_2 = U_1 \\
a_{21} X_1 + X_2 &+ b_{21} Z_1 + b_{23} Z_3 = U_2 \\
a_{32} X_2 + X_3 &+ b_{31} Z_1 + b_{33} Z_3 = U_3 \\
a_{42} X_2 + a_{43} X_3 + X_4 &+ b_{43} Z_3 = U_4
\end{aligned}
\tag{5a}
\]

* Continuous signed arrows refer to predicted political relationships; signed dashed
arrows refer to predicted socioeconomic links.

Comparing the signs and magnitudes of the various links of the
revised communist voting-polyarchy theory, we see that the
"viciousness" of the less polyarchy -> more violence -> more
communist voting -> less polyarchy circle has disappeared; it is no longer
in unstable equilibrium. Both the three-arrow and five-arrow
circles now have an odd number of minus signs, if one calculates
path coefficients along these two routes. Negative net path
coefficients indicate negative feedback, i.e., re-equilibrating
tendencies in political processes (see 15, 19, 54) that are more
consistent with our original assumptions about the longitudinal causal
interpretability of cross-sectional analyses. Introducing time lags in
relationships among these variables would then allow growth,
equilibrium, or decay in such adjustment processes to occur.

See Allardt's paper (4) for details about the existence of rural and urban
communism in Finland. A hierarchical model reversing the direction of the communist vote
-> polyarchy links gave unsatisfactory predictions. It should be quite clear, however,
that our small sample size means sampling errors associated with choosing one model
over another would be large in the present context.
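The sign-counting argument can be made concrete by multiplying causal effects around each circle. In this sketch the original theory is represented only by its predicted arrow signs, while the revised theory uses the Table 3 estimates with their signs reversed (following the note to that table, a minus coefficient corresponds to a plus arrow). The node labels and the `loop_gain` helper are our own, and the longer circle is traced as communist vote -> polyarchy -> participation -> violence -> communist vote.

```python
from math import prod


def loop_gain(effects, loop):
    """Multiply the causal effects along a closed loop given as a list of nodes."""
    edges = zip(loop, loop[1:] + loop[:1])
    return prod(effects[edge] for edge in edges)


# Original theory: predicted arrow signs only, keyed as (cause, effect).
original_signs = {
    ("comm", "poly"): -1,  # communist voting lowers polyarchy
    ("poly", "viol"): -1,  # polyarchy discourages violence
    ("poly", "part"): +1,  # polyarchy encourages participation
    ("part", "viol"): -1,  # participation reduces violence
    ("viol", "comm"): +1,  # violence breeds communist voters
}

# Revised theory: causal effect = minus the estimated coefficient in Table 3.
revised_effects = {
    ("comm", "poly"): 0.40,   # a21 = -.40
    ("poly", "viol"): -0.24,  # a42 = .24
    ("poly", "part"): 0.20,   # a32 = -.20
    ("part", "viol"): -0.39,  # a43 = .39
    ("viol", "comm"): 0.49,   # a14 = -.49
}

SHORT = ["comm", "poly", "viol"]
LONG = ["comm", "poly", "part", "viol"]

original_gains = [loop_gain(original_signs, loop) for loop in (SHORT, LONG)]
revised_gains = [loop_gain(revised_effects, loop) for loop in (SHORT, LONG)]

print("original theory loop signs:", original_gains)
print("revised theory loop gains:", [round(g, 4) for g in revised_gains])
```

A positive product marks a deviation-amplifying circle and a negative product a re-equilibrating one; the small magnitudes in the revised case suggest that the damping, though present, is weak.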

IV. Causation and Freedom 

Determinism attracts the social scientist for a number of reasons. 
Causal agents are seen as unmoved movers, while causal laws 
order chaotic experience. Thus causal explanations go below sur- 
face relationships to determinative realities. But the use of partly 
deterministic causal models does not imply the absence of free 
choice, even within the deterministic parts of these theories. The 
real problem is the use of more or less coercive or irrational power:
"The distinction between free choice and behavior that is
compelled is drawn within the domain of causation. ... A free choice
is not uncaused, but one whose causes include in significant
measure the aspirations and knowledge of the actor who is
choosing." (32, p. 121). There are no apparent a priori reasons why
choices freely or rationally made should persistently fail to exhibit
lawful regularities; the same possibilities should also exist for
coerced or irrational choice. These regularities should not, however,
be confused with the logical necessity of tautological mathematical
relationships.

Despite such arguments as these, mathematical models of po- 
litical relationships for many political scientists continue to con- 
note the restriction of freedom rather than the satisfaction of 
curiosity or opportunities for political development. It may therefore
be of value briefly to discuss the kinds of freedom assumed or
implied in the mathematical models and causal theories discussed 
in previous sections of this paper. 

Political choices. Unlike many of the recent quantitative studies
of national political systems, the reciprocal theories of Part III
above have stressed causal interdependence among political
variables. Levels of domestic group violence and political
participation were seen both to depend on and to influence collective
decisions concerning the desired forms of political institutions. Each
of these variables can and should be thought of as representing
observed regularities in partly autonomous collective political
choices, more or less freely arrived at.

Detailed discussions of the causation-determinism issue as it applies to human
behavior may be found in references 22, 32, 39, 40, and 46.

The case for the autonomy of the political has been strongly put in Samuel
Huntington's recent review of the cross-national literature on political development
(30).

Even when a more realistic assessment of the importance of
these mutually dependent links was made by taking into account
several important exogenous variables, political decisions
(measured by the a coefficients) were seen to be at least as
determinative of political outcome as socioeconomic causation (measured
by the b's).

Residual or random causes. In both the hierarchical and the
reciprocal theories examined above, each dependent variable was
assumed to be only probabilistically determined. The residual
terms (symbolized by the U's) indicated the lack of generality of
each explanatory equation. If, as appears to be the case, the U
terms account for roughly between 0 and 50 percent of the
resultant political behavior, here too are important indications of
self-generated or even chance behavior. Except when domestic group
violence is involved, these "random" phenomena need not be
coercive ones.
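The socioeconomic share of variance behind such rough figures can be approximated from published quantities. The sketch below relies on the standard identity for standardized least-squares regression, under which the squared multiple correlation equals the sum of each standardized coefficient times the corresponding simple correlation. The inputs are the reduced-form coefficients quoted with Table 3 (0.52 being our reading of the polyarchy-income entry) and the correlations of Table 2; it is an assumption that both come from the same data, and rounding limits the precision.

```python
# Standardized reduced-form coefficients on (Z1, Z2, Z3), as quoted with Table 3.
BETAS = {
    "X1 communist vote": (0.42, -0.32, -0.12),
    "X2 polyarchy": (0.20, -0.13, 0.52),
    "X3 participation": (0.62, -0.026, 0.24),
    "X4 violence": (-0.32, 0.04, -0.25),
}

# Simple correlations with (Z1, Z2, Z3), from Table 2.
CORRELATIONS = {
    "X1 communist vote": (0.179, -0.172, 0.008),
    "X2 polyarchy": (0.487, 0.215, 0.594),
    "X3 participation": (0.769, 0.389, 0.641),
    "X4 violence": (-0.463, -0.230, -0.440),
}


def r_squared(betas, correlations):
    """R^2 of a standardized regression: sum of beta_k * r_yk over predictors."""
    return sum(b * r for b, r in zip(betas, correlations))


shares = {name: r_squared(BETAS[name], CORRELATIONS[name]) for name in BETAS}
for name, share in shares.items():
    print(f"{name}: about {100 * share:.0f}% socioeconomically 'predetermined'")
```

The resulting shares range from a little over a tenth to roughly six tenths, the same order of magnitude as the rough percentages the paper attributes to socioeconomic variables; whatever remains is left to political variables and random causes.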

Variable dependence coefficients. Besides the political choices
and random terms, another aspect of freedom or indeterminacy in
the above mathematical theories is the dependence coefficients
themselves. They were not predicted on an a priori basis; rather,
they were estimated from a particular set of data. As originally
specified, the arrow models indicated the existence and possibly
the direction of selected causal links. The sizes of the related
dependence coefficients were estimated only after the relevant data on
independent and dependent variables were collected. Because
these findings are only descriptive of a particular set of data (and
certainly not a random sample at that), other dependence
coefficients consistent with the theories described are quite possible for
different sets of data and different time periods. Granted that the
choice of the degree of open competitiveness in national political
systems is partly determined by a variety of other political and
socioeconomic factors, changes in the relevant mix of these causal
factors represent significant differences between various kinds of
political systems.

Herbert Simon has made much the same kind of argument including exogenous
influences on presidential choices in making a more realistic assessment of presidential
power: "If we regard the President as an 'independent variable,' then we arrive at one
assessment of his influence. If we add to our system the environmental influences created
by the administrative bureaucracy which greatly restricts the variability that
differences in personal qualities and beliefs would otherwise produce . . . , we arrive at a
smaller estimate of the influence of those personal qualities and beliefs." (43, p. 6B)

These rough estimates were derived from Figure 3 and Table 4. The squared
multiple correlation coefficients (R²'s) for the model's 4 reduced form equations
indicate that approximately 11%, 40%, 25%, and 60% of the variation of their respective
dependent variables was "predetermined" by socioeconomic variables. If the remaining
variance is attributed about equally to political variables and also to random causes,
the above 0-50% variance estimates are approximately correct.

This means that such models can be used to define and
measure structural changes associated with evolution or decay of
political systems. Changes in the very "laws" governing social and
political relationships are, in fact, some of the most frequent topics
of concern in the classical literature of politics. Causal models may
even be used to explain why such gradual or revolutionary breaks
with the past have occurred!

A multiplicity of empirically acceptable causal models. As our
analysis of the original and revised Lerner models suggested,
more than one causal model may be consistent with a particular 
set of observed correlations. This may be true whatever the gen- 
erality of the correlations concerned. Specifically, our procedure 
of rejecting or failing to reject particular causal theories never 
succeeded in eliminating all but one possible theory. At best sev- 
eral plausible theories were partly discounted. 

Mathematically, it is easy to show that several causal theories 
make the same empirical predictions (the developmental sequence
X -> Y -> Z and the double causal situation X <- Y -> Z, for
example, both imply that $r_{XZ.Y} = 0$). Since adding or subtracting
variables and links from a model may or may not change the num- 
ber or nature of predictions involved, both mathematical and 
methodological injunctions to be tentative in advocating the truth 
of one particular model coincide. 
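That two different arrow models can yield the same vanishing partial correlation is easy to see in simulated data. The sketch below is only an illustration: the coefficients 0.8 and 0.6, the sample size, and the random seed are arbitrary assumptions, and the first-order partial correlation is computed from sample correlations of artificial data rather than from any data in this paper.

```python
import math
import random


def corr(u, v):
    """Sample product-moment correlation of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def partial_corr(x, z, y):
    """Partial correlation of x and z with y held constant."""
    r_xz, r_xy, r_zy = corr(x, z), corr(x, y), corr(z, y)
    return (r_xz - r_xy * r_zy) / math.sqrt((1 - r_xy ** 2) * (1 - r_zy ** 2))


def simulate_chain(n, rng):
    """Developmental sequence X -> Y -> Z."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    z = [0.6 * yi + rng.gauss(0, 1) for yi in y]
    return x, y, z


def simulate_fork(n, rng):
    """Double causal situation X <- Y -> Z."""
    y = [rng.gauss(0, 1) for _ in range(n)]
    x = [0.8 * yi + rng.gauss(0, 1) for yi in y]
    z = [0.6 * yi + rng.gauss(0, 1) for yi in y]
    return x, y, z


rng = random.Random(0)
results = {}
for label, simulate in (("chain", simulate_chain), ("fork", simulate_fork)):
    x, y, z = simulate(20000, rng)
    results[label] = (corr(x, z), partial_corr(x, z, y))
    print(label, "r_XZ =", round(results[label][0], 3),
          "r_XZ.Y =", round(results[label][1], 3))
```

In both cases X and Z are clearly correlated, yet the correlation all but vanishes once Y is held constant, so this particular prediction cannot distinguish the two theories.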

Theoretically, these possibilities allow for causal situations in 
which different but indistinguishable causal models are at work. 

fact that the polyarchy-communism model gives several coefficient estimates 
with different signs when applied to economically underdeveloped nations is a graphic 
illustration of the more or less autonomous changes that developed nations have made 
in the determinants of the political characteristics, even if we assume the same arrow 
model (without signs) to apply. 

In a similar vein, Talcott Parsons has frequently argued that an impressive achieve- 
ment of most industrialized Western democracies has been the depolarization of the 
lower class-radicalism voting relationship. "Status polarization" of the 1956 American 
presidential election, for example, was almost completely avoided. These findings do 
not imply, of course, that other determinants of voting behavior did not exist. See 
the Parsons reference cited in 19 and the data in 11. 
Even within one causal model, there may be several pathways of 
change. Nonrecursive models of political competitiveness may in 
fact summarize and cover up two or more recursive explanatory 
models, each chosen by a number of national political systems. 

Determinism and freedom. It should be clear that the incom- 
plete specification of the causal models presented in this paper 
allows for a variety of choices and indeterminacies in the reality 
being described. In addition, the possibility has been suggested 
that political variables may themselves represent free and re- 
sponsible collective political choices, some of whose partially de- 
termined consequences we have tried to explore. Philosophically, 
this perspective corresponds to the humanistic view that social 
reality is only partly predetermined. Mathematically, the pro- 
cedures investigated have helped make explicit the variety and 
extent of the constraints and opportunities involved. 

V. Summary and Conclusions 

Within the variety of possible explanations for political events, 
social causation focuses on generalizations with determinative 
significance. Causal statements are also usually asymmetric in 
character, pay attention to pathways of influence between causes 
and effects, and tentatively assume other possible causes are being 
safely ignored or controlled. All of these aspects of causal expla- 
nation apply to recent social science attempts to abstract and 
generalize relatively precise and comprehensive arguments using 
linear stochastic systems. It need not be assumed that these gen- 
eralizations apply independently of the historical context from 
which they are drawn. 

Within the causal modelling tradition there are again a variety 
of procedures for theoretically coping with segments of political 
reality. Many of them bear directly on philosophical arguments re- 
garding the nature of social reality. Models may be dynamic or 
static, stochastic or deterministic, dealing either with what we 
have called hierarchical or reciprocal influence relations. Major 
alternatives also exist as to the extent of isolation we assume, 
ceteris paribus, regarding the systems of relationships being de- 
scribed. The price for logically insuring identifiable outcomes from 
a system of equations, whatever their degree of implied isolation, 
involves some additional specifications regarding the size or ab- 
sence of some of the possible causal links, including possible links 
to exogenous variables whose causes themselves are not fully 
assumed. We choose among such alternatives for a number of 
theoretical and personal reasons, but their scientific survival de- 
pends on resistance to empirical falsification. 

Within the causal modelling approach to political analysis there 
are a number of ways of accounting for human decisions and re- 
sponsibilities. Whether freely or coercively arrived at, individual 
and collective choices can be considered as themselves causally 
responsible for other political consequences. Probabilistic models 
confess from the start that specific outcomes cannot be exactly 
predicted, even if certain tendencies are known to occur. Statistical 
models with unspecified but partly restricted coefficients reduce 
but do not eliminate the degrees of freedom associated with causal 
explanation. These models allow us to study both the environ- 
mental limitations and the deterministic consequences of political 
decision-making. Moreover, they provide parameters with which 
to measure historically varying structural forms of political activity. 

This paper has only briefly illustrated several ways of combining 
increasingly available political data collections, partly inductive 
causal inference techniques, and deductive, testable theories de- 
rived from a rich tradition of qualitative political analysis.®^ Even 
though they have not explicitly introduced the time dimension, 
our test cases have implied several quite distinctive possibilities 
about the ongoing nature of the political process. The “competitive 
coexistence” model, as applied only to non-communist states, con- 
tained reequilibriating tendencies, which could also be labelled 
“negative feedback.” The “communism vs. democracy” theory, on 
the other hand, was modelled as a case of disequilibrium, de- 
stabilizing change accomplished by positive feedback relation- 
ships. The hierarchical model of political development stood some- 
where between these two reciprocal systems as a case of unidirec- 
tional change (assuming its coefficients remain positive) without 

®*That the inductive and deductive interaction of concepts and data using these 
methods can go beyond merely obvious relationships is indicated by the differences 
between the magnitudes of the simple correlations in Table 1 and the causal links in 
Table 3. Straightforward factor analysis of Table 1 would not have discovered the 
causal configuration of Figure 2. 

There are a number of inductive data analysis procedures, however, which are be- 
ginning to suggest possible causal inferences. If simple additive causal theories are 
appropriate for example, factor analysis may detect them. If, on the other hand, de- 
velopmental sequences exist like those in Figure l.A, one should apply other techniques. 
Robert Abelson has informally suggested comparing structural similarities between 
Guttman’s theory of the simplex and the equivalent pyramidal structure of recursive 
causal systems. At least for simple learning problems, simple developmental sequences 
and nearly perfect simplices seem to exist that are causally interpretable. See 25, 26, 
27, 31, 
major feedback relationships. Undoubtedly the world of politics 
(of power, influence, and authority relationships) includes all of 
these possibilities. 

Perhaps the strongest argument in favor of causal models like 
those we have discussed is that they help answer the central "who 
gets what, when, and why" questions of political analysis. Leaving 
more or less implicit many of the persuasion or reasoning processes 
accounting for the magnitudes of certain dependence coefficients, 
causal models can nonetheless help explain what maintains or 
changes the distribution of political power, social respect, and 
mental and physical health within particular societies. Interna- 
tionally, for example, the tentative analyses of the present paper 
have suggested that most European nations have high political 
participation levels because of their high levels of urbanization, 
literacy, and media development. The citizenry of countries with 
little urbanization, many illiterates, and few mass media are less 
fortunate in this respect. Free political institutions in the Western 
world are in part maintained by their tendency to decrease violent 
domestic behavior which itself finds expression within more radi- 
cal voting positions that are tolerated only by open societies. 
The Representational Model in 
Cross-National Content Analysis 

RICHARD L. MERRITT^ 

Yale University 

Systematic content analysis as a tool for political research is not 
particularly new. In primitive form it flourished during the 1930s. 
Students of journalism and others ascertained attention patterns, 
as indicated by column inches or occasionally by word counts, for 
wide varieties of newspapers and other publications; they com- 
pared patterns of attention to political events in the same publi- 
cations over time; they contrasted political interest in large metro- 
politan dailies to that in small-town weeklies; they spent much 
time with questions of appropriate sampling and validation 
techniques. 

It remained for Harold D. Lasswell and his colleagues, how- 
ever, to develop content analysis as a tool specifically for com- 
parative political research. Their studies of attention patterns in 
the "prestige papers" of five countries set standards of precision, 
clarity, and objectivity that students of comparative political be- 
havior have sought to emulate for well over a decade. Since then 
David C. McClelland, Karin Dovring, Robert C. North, Robert 
C. Angell, J. Zvi Namenwirth, Richard L. Merritt and Ellen B. 
Pirro, and others have undertaken substantial content analyses of 
aspects of the communication process relevant for the cross- 
national study of politics. 

The increasing importance of content analysis as a research 
tool, no less than the fact that increasingly large sums are being 
spent for studies using the technique, suggests that the time has 
come to pause and re-examine some of its fundamental assump- 
tions. One important assumption concerns the nature of the “rep- 
resentational model” used in such studies, that is, the posited rela- 
tionship between observed and unobserved aspects of the commu- 
nication process. In this paper I shall express my own concern 
about developments along this line, paying particular attention to 
some of the cross-national content analyses of recent years. One 
caveat at the outset: If my comments appear unduly pessimistic, it 


^ Research on this project has been supported by the Yale Political Data Program, 
is because of the cavalier treatment given to this problem by some 
scholars rather than because of the fact that the problem itself is 
irresolvable. The problem can be resolved. But, as I shall suggest, 
its resolution will require both serious thinking about the meth- 
odology of content analysis and serious experimental work. 

Content Analysis and the Communication Process 

The communication process, in Lasswell's phrase (slightly 
modified), deals with WHY WHO says WHAT to WHOM and 
with WHAT EFFECT, as expressed schematically in Figure 1. Con- 
tent analysis focuses on the message, or the WHAT in Lasswell's 
formulation. It is the systematic, objective, and quantitative char- 
acterization of content variables manifest or latent in a message.* 
In principle any type of message may be content analyzed; inter- 
esting cross-national work has been performed on movies by 
Martha Wolfenstein and Nathan Leites;® on plays by Donald V. 
McGranahan;^ and on doodling and designs on vases by Elliot 
Aronson.® To date, however, most cross-national content analyses 
have dealt with written messages; and it is with these that this 
chapter will be primarily concerned. 

Content analysis research entails a number of distinct but inter- 
related steps. This is not the place to discuss its methodology at 
great length; but a brief outline of these steps, and some of the 
problems encountered at each, will help to set the stage for some 
remarks on the representational model. 

® Bernard Berelson, Content Analysis in Communication Research (Glencoe: Free 
Press, 1952), p. 18. This paper will not deal with the nonfrequency type of content 
analysis discussed by Alexander L. George, "Quantitative and Qualitative Approaches 
to Content Analysis," in Ithiel de Sola Pool (ed.), Trends in Content Analysis (Urbana: 
University of Illinois Press, 1959), pp. 7-32. 

® Martha Wolfenstein and Nathan Leites, Movies: A Psychological Study (Glencoe: 
Free Press, 1950). 

^Donald V. McGranahan and Ivor Wayne, "German and American Traits Reflected 
in Popular Drama,” Human Relations, I (1948), 429-55. 

® Elliot Aronson, "The Need for Achievement as Measured by Graphic Expression,” 
in John W. Atkinson (ed.), Motives in Fantasy, Action, and Society: A Method of 
Assessment and Study (Princeton: Van Nostrand, 1958), pp. 249-65. 
The Formulation of Hypotheses. Ideally, the analyst formulates 
his hypotheses (as well as their alternatives) for testing at the 
outset of his project. Content analysis is useful only when the 
researcher has questions of a quantitative nature— how often? how 
much? how many? with what covariance?— that can be answered 
by counting the appearance of a limited number of content varia- 
bles in a given body of data. It is not particularly helpful if the 
task of research is merely to determine the timing of a sequence 
of events (such as the death of Stalin and the subsequent emer- 
gence of Khrushchev as the Soviet leader); it is of more use in 
trying to determine what effects the events had upon people's 
perceptions, attitudes, and values (such as in messages communi- 
cated by the Soviet elite). The task of the analyst is to frame his 
questions so that quantitative data can answer them clearly, di- 
rectly, and simply. 

It must be added that there is usually considerable interplay be- 
tween the hypothesis-formulation stage and data-gathering stages 
in a content analysis, as in other types of research. It may even 
turn out that the most fruitful hypotheses do not emerge clearly 
until after the analyst has examined his preliminary findings. 
Other important scientific discoveries have resulted from studies 
based on hunches rather than rigidly formulated propositions. 
Sometimes research of this sort is inefficient; but the analyst who 
is sensitive to his findings may produce more interesting and 
meaningful results than the analyst who is blindly testing pre- 
formulated hypotheses. 

The Selection of an Appropriate Sample. The determination of 
what body of material could be used to test the hypotheses rests 
upon both the availability of data and the nature of the inferences 
to be drawn from the analysis. To get an idea of values current 
among Soviet elites, for instance, it would be ideal if we had 
access to the minutes of Presidium meetings. But such data are 
not at our disposal. In their absence, will the news columns of 
Pravda or Izvestia give us the information we want? Similarly, if 
our files of the most nearly ideal body of material are incom- 
plete, it may be necessary to work out a compromise: accepting 
the information loss due to missing data; estimating the nature of 
the missing data through statistical techniques already developed; 
selecting a second-best source of data; possibly even using avail- 
able files of the first choice as a check on trends present in the 
second choice. 
The sampling procedure itself is a function of the type and 
amount of information needed to test the hypotheses as well as 
of what economists call the "opportunity cost" of securing a certain 
amount of information. Appropriate sampling techniques include 
random sampling, using a table of random digits; systematic 
sampling, selecting every nth item in a series, or picking news- 
paper issues on the 1st and 15th of every month; and stratified 
random sampling for bodies of material that can be broken down 
into discrete categories. 
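
The three techniques just listed can be sketched in code. This is an added, minimal illustration: the universe of 120 newspaper issues is hypothetical, the function names are invented for the example, and Python's seeded pseudo-random generator stands in for the table of random digits mentioned in the text.

```python
import random


def simple_random_sample(items, k, seed=0):
    """Draw k distinct items at random (stand-in for a random-digit table)."""
    rng = random.Random(seed)
    return rng.sample(items, k)


def systematic_sample(items, n, start=0):
    """Select every nth item in the series, beginning at index `start`."""
    return items[start::n]


def stratified_random_sample(items, stratum_of, k_per_stratum, seed=0):
    """Group items into strata, then draw a random sample within each."""
    rng = random.Random(seed)
    strata = {}
    for item in items:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        sample.extend(rng.sample(members, min(k_per_stratum, len(members))))
    return sample


# Hypothetical universe: 40 issues from each of three newspapers.
issues = [(paper, number) for paper in ("A", "B", "C") for number in range(40)]

random_pick = simple_random_sample(issues, 12)
every_tenth = systematic_sample(issues, 10)
by_paper = stratified_random_sample(issues, lambda issue: issue[0], 4)
print(len(random_pick), len(every_tenth), len(by_paper))  # 12 12 12
```

All three calls return twelve issues here, but only the stratified draw guarantees that each newspaper contributes equally, which is the reason for preferring it when the material falls into discrete categories.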

Validating the sample, to see whether or not it is actually rep- 
resentative of the universe of items from which it was drawn, can 
be problematical. For random samples, standard statistical tech- 
niques (e.g., “split-halves” technique within the sample itself, or 
comparison of the sample with an independent sample from the 
same universe of items) are readily available. Often in political 
research, however, we cannot be quite certain of the randomness 
of the sample. Published foreign office documents, for instance, 
are clearly not exhaustive of all documents in a country's foreign 
office. Compilers of such documents necessarily use some criteria 
of relevance in deciding which items to include and which to 
exclude. The extent to which a random sample of the published 
collection actually approximates the distribution of documents in 
the entire files is a question that demands an answer if we are 
to credit any content analysis of the sample. Statistical techniques 
will tell us whether or not the sample is representative of the pub- 
lished documents, but correction factors are necessary to answer 
the more difficult question. Or else, some serious digging must be 
done in the particular country’s foreign office files. 

The Selection of Units of Analysis. Content analysts have gen- 
erally used three types of analytical units: space, symbols, and 
themes. The relative amount of space devoted in a 
message to a particular topic is often a good indicator of the com- 
municator's concern with the topic. If we view words as symbols 
for content analysis purposes, then establishing a list of relevant 
symbols is a crucial step. The experience of the RADIR project, 
which concentrated on "symbols supposed to reflect trends in 
world politics with particular reference to changing attitudes 
toward the values of democracy, fraternity, security, and well- 
being," is instructive in this regard. Pool writes: 


Our own procedure in attempting to draw up a relatively valid list was 
to draw upon the best knowledge available and to use a long enough 
list so that the arbitrary decisions about inclusion or exclusion would 
affect the relatively infrequent terms in the tails of the word usage dis- 
tribution, rather than more common words. To draw up the list we 
called upon Harold D. Lasswell, for thirty years one of the leading stu- 
dents of political movements and propaganda. The list he drew up con- 
sisted of nouns, although the listed words were also counted when they 
appeared in other forms. The list was then subjected to the test of use. 
Any expert, by pure oversight, might omit some symbols of obvious im- 
portance. Our readers were, therefore, instructed to note and report any 
additional symbols that seemed appropriate to the list.® 

Some of the more recent cross-national content analyses have 
dealt with themes. Angell delineated 40 value dimensions relevant 
for Soviet and American ideology (e.g., "Mode of Ownership of 
Property"), and coded "elite" publications in the two countries 
according to several possible positions along each dimension. 
McClelland searched children's readers in 41 countries for their 
concern with a need for achievement, affiliation, and power. And 
Stone's General Inquirer "tags" words (which can also be used as 
symbols) according to a predetermined list of concepts. Which 
type of content variable is most appropriate for a particular analy- 
sis rests, of course, upon the type of information needed to test the 
researcher's hypotheses. 

Establishing Procedures for Counting. Perhaps the simplest type 
of content analysis uses a frequency count of the appearance of 
the content variables. Merritt, in his examination of the colonial 
American press, for instance, tabulated the frequency with which 
place-name symbols occurred.^ Angell counted the frequency with 
which Soviet and American publications took positions along his 
40 value variables. In the former case, each reference to a unit of 
analysis was recorded; in the latter, no content variable could be 
coded more than once in any single communication. 
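
The difference between the two counting rules is easy to see in a small sketch. The example below is an added illustration with invented data: the two sentences and the three place-name symbols are hypothetical and are not drawn from either study, and simple word matching stands in for whatever coding judgment a human reader would apply.

```python
from collections import Counter

PUNCTUATION = '.,;:!?"()'


def tokens(text):
    """Lower-case words with surrounding punctuation stripped."""
    return [word.strip(PUNCTUATION) for word in text.lower().split()]


def count_every_reference(documents, symbols):
    """Rule 1: record each appearance of a symbol."""
    counts = Counter()
    for text in documents:
        for token in tokens(text):
            if token in symbols:
                counts[token] += 1
    return counts


def count_once_per_document(documents, symbols):
    """Rule 2: a symbol is coded at most once in any single communication."""
    counts = Counter()
    for text in documents:
        for symbol in symbols & set(tokens(text)):
            counts[symbol] += 1
    return counts


# Hypothetical communications and symbol list.
documents = [
    "Boston merchants wrote to London; Boston waited for a reply.",
    "News from London reached Philadelphia and Boston.",
]
symbols = {"boston", "london", "philadelphia"}

every = count_every_reference(documents, symbols)
once = count_once_per_document(documents, symbols)
print(sorted(every.items()))  # [('boston', 3), ('london', 2), ('philadelphia', 1)]
print(sorted(once.items()))   # [('boston', 2), ('london', 2), ('philadelphia', 1)]
```

Under the first rule a communication that repeats a symbol weighs more heavily; under the second, every communication counts equally, so the totals measure how many messages mention a symbol rather than how intensely they do so.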

It is also possible to add vectors to frequency counts. The 
RADIR studies noted whether the context of the tabulated sym- 
bols was positive, neutral, or negative. The Stanford project on 
conflict and integration, directed by North, codes communications 
along several dimensions, such as “good-bad,” “active-passive,” 

® Ithiel de Sola Pool, with Harold D. Lasswell, Daniel Lerner, et al., The "Prestige 
Papers": A Survey of Their Editorials (Stanford: Stanford University Press, 1952), 
pp. 16-17. 

^Richard L. Merritt, Symbols of American Community, 1735-1775 (New Haven: 
Yale University Press, 1966). 
"strong-weak," and "hostility-friendship."* The Yale Arms Control 
Project coded French, German, British, and American editorial 
responses to arms control proposals along 5-point or 7-point scales 
according to the perceived specificity or diffuseness of the pro- 
posal, its operationality or nonoperationality, the level of affect 
displayed, and so forth.® 

In either event objectivity requires that these technical aspects 
of the content analysis be specified in advance, and that through- 
out the analysis there be strict adherence to the coding procedure. 

A special problem that arises with computerized content analysis 
is the transformation of existing material into texts that can be 
handled by current programming techniques. In the case of the 
General Inquirer, this requires a certain amount of editing: 
breaking "complex sentences down into simple thought-sequence 
units";* adding information not normally found in the computer's 
memory drum (e.g., adding parenthetically "warm vacation place" 
to references to Florida); clarifying the referent of ambiguous 
words (references to the singer George London and the city of 
London); and "tagging" some words or combinations of words 
relevant to concepts in which the analyst is interested (e.g., 
"affect," "European economic integration"). The Stanford project 
utilizes "evaluative assertion analysis" which translates messages 
into a "simple, three-element assertive format."^^ Such transforma- 


®Cf. the special issue on "Case Studies in Conflict," edited by Robert C. North, 
Journal of Conflict Resolution, VI (1962), 197-268; and Robert C. North, Ole R. 
Holsti, M. George Zaninovich, and Dina A. Zinnes, Content Analysis: A Handbook 
with Applications for the Study of International Crisis (Evanston: Northwestern 
University Press, 1963). This paper does not deal with the representational model used 
by the Stanford project, primarily because I have found nowhere a detailed discussion 
of sampling techniques; cf. J. David Singer, "Data-Making in International Relations," 
Behavioral Science, X (1965), 68-80, especially p. 71. 

®For a preliminary analysis, see Richard L. Merritt and Ellen B. Pirro, "Press Atti- 
tudes to Arms Control in Four Countries, 1946-1963" (New Haven: Yale University, 
Political Science Research Library, mimeographed, 1966). 

Philip J. Stone, Robert F. Bales, J. Zvi Namenwirth, and Daniel M. Ogilvie, "The 
General Inquirer: A Computer System for Content Analysis and Retrieval Based on the 
Sentence as a Unit of Information," Behavioral Science, VII (1962), 492. 

North et aL, Content Analysis, p. 91. The difficulty of working with evaluative 
assertion analysis may be illustrated by the example of its operation given on j^. 93. 
The main author of the chapter in question, Holsti, turns the following sentence 'from 
a Communist Chinese newspaper, 

The treacherous American aggressors are abetting the corrupt ruling circles of Japan. 
into four evaluative assertions: 

1. Americans are treacherous. 

2. Americans are aggressors. 

3. Americans are abetting Japanese ruling circles. 

4. Japanese ruling circles are corrupt. 

If we follow the normal canons of logic, the four assertions are most assuredly not 
a reasonable restatement of the original sentence. 
tions, although seemingly simple, may contribute significantly to 
the level of error in any content analysis. 

Training Coders and Testing Coder Reliability. There is gen- 
eral agreement that coders need to be sufficiently trained and 
have enough understanding of the coding categories that two 
coders working independently will produce quite similar results. 
This implies a necessity for working out coding manuals and 
other training procedures. When coding is being performed for 
the analyst who originated the coding procedures, there is a 
marked tendency to postpone any effort to formalize the tech- 
niques, that is, to write them down in detail. If students and 
scholars at other universities are to be able to use the procedures, 
however, explicit and precise coding manuals are imperative. 
With time, as computerized content analysis becomes more fully 
developed, it will be possible simply to exchange data prepara- 
tion routines and computer programs that can be used anywhere 
by a novice in "cookbook” fashion. 

Testing intercoder reliability is a relatively underdeveloped 
facet of content analysis. This is not to say that techniques to 
measure reliability have not been developed. Or even that Berel- 
son’s complaint of a decade and a half ago is still valid: "What- 
ever the actual state of reliability in content analysis, the published 
record is less than satisfactory. Only about 15-20% of the studies 
report the reliability of the analysis contained in them.”^® In fact, 
the most important cross-national analyses of recent years have 
been quite careful to discuss their problems of reliability. 

Two key aspects of intercoder reliability checks have nonethe- 
less received insufficient attention. The first is the question of 
acceptable levels of reliability. What does it mean when a con- 
tent analyst reports that his reliability score for two coders, using 
a simple percentage agreement test,^® is .70? How much more 
useful or valid is the analysis if the percentage agreement is .80 
or .90? How does a reliability coefficient for the percentage agree- 
ment test compare with Scott's reliability index or with a Pear- 
sonian product-moment correlation coefficient? Second, very little 
experimental information exists on the determinants of coder 
reliability. What role does the explicitness of the instructions in 

^^Berelson, Content Analysis in Communication Research, p. 172. 

For an excellent discussion of reliability indices, see William A. Scott, “Reliability 
of Content Analysis: The Case of Nominal Scale Coding,” Public Opinion Quarterly, 
XIX (1955), 321-25. 
the coding manual play? What types of training and practice pro- 
cedures are most likely to enhance coder reliability? Does the 
difficulty of securing high intercoder reliability coefficients increase 
if themes rather than symbols are coded? What impact does the 
educational and intelligence level of the coder have upon his per- 
formance? It seems to me that attention to these basic issues 
would be fruitful for the future development of content analysis.^ 
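
One reason a bare percentage agreement score is hard to interpret is that it takes no account of the agreement two coders would reach by chance. The sketch below is an added illustration that compares percentage agreement with Scott's index (pi), which subtracts the agreement expected from the pooled category proportions. The ten coding decisions are hypothetical, and the function names are invented for the example.

```python
from collections import Counter


def percent_agreement(coder_a, coder_b):
    """Share of items that the two coders placed in the same category."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty item list")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


def scotts_pi(coder_a, coder_b):
    """Scott's pi: (observed - expected) / (1 - expected).

    Expected agreement is the sum of squared category proportions,
    pooled over both coders. Undefined if only one category is ever used.
    """
    observed = percent_agreement(coder_a, coder_b)
    pooled = Counter(coder_a) + Counter(coder_b)
    total = 2 * len(coder_a)
    expected = sum((count / total) ** 2 for count in pooled.values())
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten editorials as positive or negative.
coder_a = ["pos"] * 7 + ["neg"] * 3
coder_b = ["pos"] * 6 + ["neg"] * 3 + ["pos"]

agreement = percent_agreement(coder_a, coder_b)
pi = scotts_pi(coder_a, coder_b)
print(agreement, round(pi, 2))  # 0.8 0.52
```

In this invented case the coders agree on 80 per cent of the items, yet because one category dominates, 58 per cent agreement would be expected by chance and the corrected index is only about .52. The same raw score can therefore correspond to quite different levels of reliability.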

Inferences from Content Analysis 

As suggested earlier, content analysis focuses on the message, 
or the WHAT in Lasswell's formulation. Our reasons for wanting 
to know the substance or form of the message may be various. On 
the most trivial level, the message may merely be of intrinsic 
interest to us: we may be curious to know, for instance, what the 
Western European press says about a particular arms control 
proposal, or the frequency of certain types of word usage in the 
editorials of "elite" newspapers. More frequently, however, we are 
interested in the message because we think it contains clues about 
other, less directly observable, aspects of the communication 
process.^® 

Sometimes the content analyst is interested in the recipients of 
a set of messages— the WHOM of the earlier formula. Part of the 
justification for using "prestige papers" to estimate the mood of 
elites is, according to Pool, that these newspapers are “read by 
the elite.”^® The question of readership posed by Pool’s assertion 
may be looked at in two ways. On the one hand, we would like 
to know who the actual, as opposed to the intended, recipients 
of the message are. Who in fact reads the New York Times? Of 
those, who reads the editorials? What percentage of the reader’s 
total time spent per day in gaining new information is devoted 
to perusal of the Times? On the other hand, what do the intended 


What is proposed is methodological research similar to that on interviewer bias 
performed by early researchers. For a graphic description of the process of train- 
ing coders, see Charles P. Smith and Sheila Feld, "How to Learn the Method of 
Content Analysis for n Achievement, n Affiliation, and n Power," in Atkinson (ed.), 
Motives in Fantasy, Action, and Society, pp. 685-818; and Feld and Smith, "An Evalu- 
ation of the Objectivity of the Method of Content Analysis," in ibid., pp. 234-41. 

^®A number of important points cannot be discussed in this paper. One is the 
problem of error stemming from the encoding and decoding aspects of the communi- 
cation process. A second is the question of inferences drawn on the latent and manifest 
levels of communication. Third, there is the issue of whether the message performs an 
instrumental or representational function for the communicator. On this last point, 
see Pool (ed.), Trends in Content Analysis, pp. 206-12. 

^®Pool et al., The "Prestige Papers," p. 7. 
recipients of the messages in fact read? What percentage of them 
reads the message? Which of them also read other (and possibly 
contradictory) messages as well? As will be suggested later, the 
answers to such questions lie not in a content analysis itself, nor 
even in the force of logic. Questions of actual as opposed to 
intended readership lie more properly with various types of media 
analysis through survey research. 

The issue of WHAT EFFECT the message has upon its recipi- 
ent is still thornier. Pool's assertion that the "prestige" newspapers 
are not only read by elites but also “influence them” raises un- 
answered questions about individual and group decision-making 
processes.^^ To be sure, it is important to know what is made 
available to a decision-making system. But it is even more im- 
portant to know what is assimilated or accepted for use by the 
system. For instance, suppose that a person reads a message 
telling him to vote for a particular candidate in an election, and 
then goes out to vote for him. Can we infer a causal relationship 
between the message and the ballot? It may be that the person 
happened to pick up the message as he was already on his way to 
vote for the candidate. Or it may be that persons likely to vote for 
the candidate are more likely than others to happen upon such 
literature. Or it may be that the message did indeed persuade the 
voter to opt for the candidate. At this stage in political research, 
determining the effect of communication upon attitude change is 
simply not a function of content analysis itself (unless the analyst 
has independent validating evidence, such as that produced by 
experimental psychology, in which case the content analysis may 
be superfluous). 

Sometimes we are interested in determining WHO the com- 
municator is. This is the case in propaganda analysis where we 
assume that, if we know the source of a message, we shall also 
know the extent to which it is likely to contain biased information. 
Lasswell and his colleagues used content analysis techniques 
during World War II with great effectiveness to determine the 
extent to which certain American publications contained news and 
editorial comment stemming from Nazi sources.^® Discovering who 
the author of a message is has also been important in some types 


Ibid. 

Harold D. Lasswell, "Detection: Propaganda Detection and the Courts," in 
Harold D. Lasswell, Nathan Leites, and Associates, The Language of Politics: Studies 
in Quantitative Semantics (New York: Stewart, 1949), pp. 173-232. 




of literary detective work. Recent efforts by Mosteller and
Wallace, using electronic computers, to infer who wrote which 
of the Federalist papers are exemplary in this regard.^® 

In cross-national political research it is usually clear who the 
communicator is. Sleuthing is generally directed to other ends. The 
question of who the communicator is nonetheless raises in elementary
form the basic issue of the representational model used
in content analysis research. That is, are we interested in the communicator
himself, because of his personal attributes? Or do we
examine his messages because he seems to be speaking for some 
other group, such as the organization or culture of which he is 
a member? Another way of looking at these questions is to ask 
what motivates the communicator: WHY does he transmit a par- 
ticular message? 

The Representational Model: Why the Communicator Communicates

Individual motivation rests upon a variety of subtly operating 
factors in the human psyche. Not the least of these is the nature
of the information that an individual has at his disposal when he
makes decisions. The amount of information available to the individual
is limited by both chance and choice. He does not see,
for instance, most newspapers published in the United States, nor 
is it likely that he could manage to read them were they all delivered
on his doorstep. Every individual consciously and unconsciously
screens out certain types of information: he may deliberately
choose to skip some sections of his morning newspaper,
such as the women’s page or the financial section; if he reads the 
paper when he is tired he may miss some of the more subtle points 
expressed by editorial writers; moreover, experimental evidence 
indicates that some people literally do not see certain items that 
disagree with their preconceptions. In contrast to the input of 
current information— values, attitudes, beliefs— there is also infor- 
mation stored in memory. In the individual’s active memory is 
much information that can be readily recalled, information ranging 
from the date of his birth to his perception of the course of events
in Vietnam. More deeply stored information includes items of
very low salience, such as the telephone number of his childhood 
residence, as well as such repressed data as painful emotional
experiences in childhood. Individual motivation also rests upon a
person's perception of alternative courses of action as well as their
likely outcomes. Some behavior is purposeful; a person postulates
a set of goals and then implements them as best he can. At the
same time it must be added that random or habitual behavior
often plays a role in the communication process, in determining
what things a person will communicate and how he will communicate
them.

Frederick Mosteller and David L. Wallace, Inference and Disputed Authorship:
The Federalist (Reading, Mass., Palo Alto, Calif., and London: Addison-Wesley, 1964).

In short, individual motivation is at best a complex mix of both 
current and stored information, perceptions of modes of behavior, 
and some nonrational factors such as chance and habit. If the 
task of content analysis is to infer a person's motivations from his
messages, then what is needed is a sound theory bridging the 
gaps among motivation, verbal behavior, and other forms of 
behavior. Freudian psychology presents one possible bridge; the 
goal of the psychoanalyst (who was instrumental, by the way,
in the development of content analysis techniques) is to try to 
account for individual behavior through the examination of a wide 
range of the individual’s messages. Some scholars have even tried 
to “psychoanalyze” historical personages by content analyzing their
verbal messages and comparing these messages with those pro- 
duced by currently living personality types, whose characteristics 
have been analyzed clinically.®® 

The problem of motivation becomes still more complex as soon 
as we move from the personal to the public realm. Political psy- 
chology aside, content analysis generally deals not with the private 
utterances of a man lying on a psychoanalyst’s couch but with his 
public messages— the speeches he delivers, the pictures he paints, 
the memoranda and position papers he drafts, the editorials he 
writes, and so forth. If we are looking for the reason (or motivation)
for such communications, then we may examine either the
man’s personality structure, or his relationship with the environment,
or both. The question to be asked is: Whom or what does
the individual represent when he communicates? 

For an excellent example, see Alexander L. and Juliette L. George, Woodrow
Wilson and Colonel House: A Personality Study (New York: Day, 1956).

One possible answer is that he represents himself and no one
else. He is seeking to express his own mind rather than pretending
to be the spokesman for any group or culture. Such an answer,
however, poses new questions: (1) How accurately does the
message reflect his “true” feelings? (2) Why did he choose the
particular mode of communication that he did? (3) Why did 
others permit the article to be published or the speech to be de- 
livered publicly? To what extent were they in agreement with the 
values, attitudes, and beliefs expressed in the message? If the 
level of agreement were high, then we might argue that the communicator,
regardless of his intentions and preferences, may be
perceived to be “representing” someone else (e.g., those in control
of the means of communication). (4) What influence did the
communicator's group memberships have on the substance and
form of his message? To what extent was the communicator aware
of such group influences? Did he seek to counteract them? Again, 
regardless of intention or preference, the extent to which the 
message reflects the actual constellation of group values, attitudes, 
and beliefs may be taken as an indication of the extent to which 
the communicator represents the group. The task, however, is to 
discover the degree of congruence. 

An alternative answer is that the communicator is in fact rep- 
resentative of some other group. Thus we may be less interested 
in the remarks of a specific general as an indicator of his personal 
views than as an indicator of what "generals" or even the "military 
elite" think. Among the significant questions that arise if we take 
this position are the following: (1) How accurately does the communicator's
message reflect the “true” feelings of the group? (2)
To what extent is the linkage perceived or consciously sought by
either the individual or the “represented” group? In the previous
paragraph I suggested that a representational model might be 
inferred in certain circumstances despite specific disclaimers on 
the part of the communicator that he does or is seeking to speak 
for the particular group. The other side of this is the extent to 
which a communicator may be said to speak for a group even 
though the group disavows him and openly rejects his views. (3) 
In the presence of a clear link between communicator and group, 
how can we tell whether the communicator is consciously trying 
to mirror group attitudes or whether he is writing to persuade the 
group to adopt new attitudes? In the latter case the manifest con- 
tent of the message might deviate substantially from group norms. 
(4) Given the fact that most people are members of more than 
one group, what is the mix of different group influences that is 
relevant for any single individual's messages? When a doctor, who
happens also to be a Catholic and of Italian extraction, writes an
editorial in the Journal of the American Medical Association, how
sure can we be that his views “represent” those of the rest of the
medical profession, which comprises by and large Protestants of 
Anglo-Saxon origin? (5) To what extent is any message that a 
person communicates influenced by the overall culture of which 
he is a member? That is, how much by way of group or cultural 
values creeps autonomously into every message? 

If the purpose of content analysis, then, is to extrapolate from 
observed variables in messages to nonobserved motivational vari- 
ables, two interrelated questions are crucial. First, is the com- 
municator perceived or assumed to be representing his own views, 
those of the group or groups to which he belongs, or those of his 
overall culture? And, second, what mix of conscious and uncon- 
scious elements goes into the formulation of his message? Let us 
turn to some of the more recent cross-national content analyses to 
see how such questions have been treated. 

Pool et al.: The “Prestige Papers.” Perhaps the most elaborate of
these studies is the Hoover Institute’s research project on Revolu- 
tion and the Development of International Relations (RADIR 
Project). The published portions of the project analyze symbols 
of democracy and internationalism in newspapers from five coun- 
tries, covering the years from 1890 to 1949: 


Great Britain    The Times (1890-1949)
Russia           Novoe Vremia (1892-1917); Izvestia (1918-1949)
United States    The New York Times (1900-1949)
France           Le Temps (1900-1942); Le Monde (1945-1949)
Germany          Norddeutsche allgemeine Zeitung (1910-1920);
                 Frankfurter Zeitung (1920-1932);
                 Völkischer Beobachter (1933-1945)


As justification for the decision to examine editorials in these
newspapers, Pool writes:

In each major power one newspaper stands out as an organ of elite 
opinion. Usually semiofficial, always intimate with the government, these
“prestige papers” are read by public officials, journalists, scholars, and
business leaders. They seldom have large circulations, yet they have enor- 
mous influence. They are read not only in their own countries, but also 
abroad by those whose business it is to keep track of world affairs. They 
differ among themselves, but, despite national and temporal differences, 
they are a distinct species. It is generally possible to name with fair
confidence one paper in any given country which plays the role of prestige
paper at any given time.

The prestige paper is in some respects a good index of elite behavior.
It is read by the elite and influences them. In addition, it is produced
by men who have themselves become part of the elite and share the
typical life pattern of the elite.®^

The argument is plausible, but is it true? We know that the “prestige
papers” are representative of something or someone. But of
what or of whom? To take a recent example, New York Times 
editorials on the issue of American participation in the Vietnam 
struggle, prior to the summer of 1965 at least, could scarcely be 
called indicative of government policy or even of informed opinion
among American elite groupings. It is doubtless true that,
over the long run and given a wide range of issues, the New York
Times is closer to “official” or “elite” opinion than any other single
publication in the United States. Despite the fairness of this 
assumption, it cannot be a fully satisfactory answer to the ques- 
tion raised above until empirical tests can show an actual (as 
opposed to an imputed) relationship between the distribution of 
attitudes in New York Times editorials and policies pursued or 
attitudes held in official or elite quarters. 

A second set of questions was raised earlier: Does the elite in 
fact read the prestige papers, either in the United States or elsewhere?
And how can we verify whether or not the editorials
influence those who read them? In these regards intensive inter- 
views with samples of elite groupings might give us relevant 
answers.^ 

Pool et al., The “Prestige Papers,” pp. 1, 7. In this analysis I shall not deal
separately with other studies using a similar representational model. Cf. Wilbur
Schramm (ed.), One Day in the World's Press: Fourteen Great Newspapers on a Day
of Crisis, with Translations and Facsimile Reproductions (Stanford: Stanford University
Press, 1959); and J. Zvi Namenwirth and Thos. L. Brewer, “Elite Editorial Comment
on the European and Atlantic Communities in Four Countries,” in Philip J.
Stone, Dexter C. Dunphy, Marshall Smith, and Daniel M. Ogilvie (eds.), General Inquirer:
A General Approach to Content Analysis (Cambridge: M.I.T. Press, 1966).

What I have in mind is a series of studies similar to those performed at the Survey
Research Center of The University of Michigan on the interaction between congressmen
and their constituents; cf. Warren E. Miller and Donald E. Stokes, “Constituency
Influence in Congress,” American Political Science Review, LVII (1963), 43-55. A beginning
may be found in James N. Rosenau, National Leadership and Foreign Policy:
A Case Study in the Mobilization of Public Support (Princeton: Princeton University
Press, 1963).

A third question is the extent to which the prestige papers compare
in their expressed values, attitudes, and beliefs with the
other newspapers in their own countries. Two projects currently
under way at Yale University are seeking clues to resolve this 
problem. One, under the direction of J. Zvi Namenwirth, is con- 
tent analyzing three “elite” and three “mass” newspapers in the 
United States, using the General Inquirer procedure. The other 
is investigating editorial attitudes in a wide variety of French, 
West German, British, and American journals toward specific arms 
control events and proposals; a comparison of the prestige papers 
with the others will at least give us an idea of how typical they 
are of the press of the different countries. 

Finally, the study of “elite” newspapers poses a problem similar 
to that faced by students of community power structures who 
concentrate upon “community influentials.” A newspaper may 
enjoy a reputation for influence when in fact it is not influential.
Other newspapers, although perhaps somewhat less “intellectual” 
than the prestige papers, may be widely read by elite groupings. 
Or it is possible that a newspaper loses whatever influence among 
the elite it once had. If we continue to concentrate upon attention 
and value patterns in newspapers after they have passed their 
zenith, we may be deluding ourselves about actual trends in the 
country. But, then, how do we know when the star of an elite 
journal is falling and that of another publication is taking its place? 

Karin Dovring: Land Reform as a Propaganda Theme. The
focus of this study is the ideological coloration of demands for 
land reform. Ten documents covering the period from 1891 to 
1952 are analyzed: 


Vatican          Papal Encyclical, “De rerum novarum” (1891)
Soviet Union     Lenin, seven pamphlets and speeches (1913-1919)
Vatican          Papal Encyclical, “Quadragesimo Anno” (1931)
Vatican          Pope Pius XII, Pentecost message (1941)
France           Tanguy-Prigent (Socialist and later Minister of
                 Agriculture), “Démocratie à la terre” (1945)
Hungary          Andras Sandór, “Land Reform in Hungary” (1947)
Bulgaria         Vulko Chervenkov (Secretary of the Central
                 Committee of the Bulgarian Communist Party,
                 later Prime Minister), speech, “Tasks of the
                 Co-operative Farms” (1950)
Italy            Government statement on the need for land
                 reform, “La Relazione Ministeriale” (1951)
Italy            Giuseppi Medici (Head of Ente Maremma, later
                 Minister of Agriculture), “Il Contratto con i
                 Contadini” (1952)
East Germany     West German Bundesministerium für gesamtdeutsche
                 Fragen, pamphlet, “Auf dem Wege
                 zur Kolchose” (1952)

These documents are searched systematically for symbols of 
identification, of demands for certain values, and of resistance to 
other values. Dr. Dovring tabulates the frequency of the symbols, 
grouped into themes; determines their function (that is, whether 
they are symbols of identification, demand, or resistance); and 
notes whether their contexts are favorable or unfavorable. 
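
The tabulation just described (frequency of symbols by theme, by function, and by favorable or unfavorable context) can be sketched in a few lines of Python. This is only an illustration of the bookkeeping under assumed inputs: the themes and the coded records below are invented, and are neither Dr. Dovring's categories nor her data.

```python
from collections import Counter

# Hypothetical coded symbols, one record per occurrence:
# (theme, function, context). Invented for illustration only.
coded_symbols = [
    ("land ownership", "demand", "favorable"),
    ("land ownership", "demand", "favorable"),
    ("landlords", "resistance", "unfavorable"),
    ("the people", "identification", "favorable"),
    ("landlords", "resistance", "unfavorable"),
    ("collective farms", "demand", "unfavorable"),
]

# Frequency of symbols grouped into themes.
by_theme = Counter(theme for theme, _, _ in coded_symbols)

# Frequency by function (identification, demand, or resistance).
by_function = Counter(function for _, function, _ in coded_symbols)

# Cross-tabulation of theme by favorable or unfavorable context.
by_theme_context = Counter(
    (theme, context) for theme, _, context in coded_symbols
)

for theme, count in by_theme.most_common():
    print(theme, count)
```

Counts of this kind are easy to produce; the harder questions, taken up next, concern what the documents being counted can be said to represent.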

As far as her representative model is concerned, Dr. Dovring
writes that the ten documents have two things in common. First, 
“they are regarded as responsible statements justifying agrarian 
policy in practice in the respective countries after the Second
World War.” And, second, “they claim to deal with agrarian or
social questions, but at the same time they are all living statements
of current ideologies in conflict today.”®^ Hence, the messages
“represent” official opinion as well as official and unofficial
ideologies.

Such a representational model poses a number of questions. 
First of all, the principle underlying the selection of particular 
documents is not stated anywhere. This is particularly noticeable 
with respect to Communist proposals. Of all postwar statements
on land reform in Eastern Europe (such as those cited in other
parts of the book of which this study forms one chapter), for example,
why were those by Sandór and Chervenkov rather than
others included in this survey? Granted that they are authoritative,
are they also representative of land reform measures in Poland,
Czechoslovakia, or Rumania? Similarly, is it true that the best
statement of East German land reform measures and proposals is
a pamphlet published by a West German government agency?
Why not Wilhelm Pieck's speech in 1945 entitled “Junkerland in
Bauernhand!” or Walter Ulbricht's lengthy chapter on “The Democratic
Land Reform”?®^

Karin Dovring, “Land Reform as a Propaganda Theme,” in Folke Dovring,
Land and Labor in Europe, 1900-1950: A Comparative Survey of Recent Agrarian
History (The Hague: Martinus Nijhoff, 1956), p. 270.

Second, is there any measure of functional equivalence among 
the different messages? That is, do they serve the same function 
in all the societies included in the survey? Is it realistic to compare 
a series of statements by Lenin during prerevolutionary and 
revolutionary times with an Italian government statement on the 
need for land reform? Since three of the ten documents stem from 
the Vatican, it seems legitimate to question what role the Vatican 
plays in European land reform. How much influence does it exert 
over individual (noncommunist) governments? Or are the three 
messages included to give an estimate of a changing mood in 
European intellectual circles? 

Third, it seems necessary to look closely at the function of a 
particular message in its society. Put another way, why did the
communicator transmit the message? Was it merely to announce 
a new policy generally acceptable to the population at large? Or 
was it to persuade intransigent opponents of the need for such a 
policy? Or was it an instrumental message designed to achieve 
other ends (e.g., promising long-run support to land-hungry peas- 
ants in exchange for their support of other controversial meas- 
ures)? Perhaps the clearest question arises here with respect to 
Lenin's statements, four of which were pamphlets written in 1913
and directed to intellectuals, and the other three of which were 
statements made to elements of the peasantry in the desperate 
years of struggle, 1918-1919. It may turn out that several different 
messages communicated by a single individual are more similar 
(regardless of intent) than messages emanating from different 
communicators; for studies of this sort we need some indication of 
the magnitude of these differences. 

McClelland: The Achieving Society. McClelland analyzed the
content of children's stories from 23 countries for the period 1920- 
1929 (centering around 1925) and from 41 countries for the period 
1946-1955 (centering around 1950), using the analytical frame- 
work developed for projective tests to measure the need for 
achievement (n Achievement), for affiliation (n Affiliation), and
for power (n Power). Data on n Achievement levels in these countries
were then correlated with two indices of modern economic
growth.

Wilhelm Pieck, “Junkerland in Bauernhand!” speech on September 2, 1945,
in Volkszeitung (Dresden), September 6, 1945; Walter Ulbricht, “Die demokratische
Bodenreform,” in Zur Geschichte der neuesten Zeit (Berlin: Dietz, 1955), Vol. I, part
1, pp. 208-38. The use of a West German source, by the way, puts Dr. Dovring in
the awkward position of having to invert all her findings on East Germany.

In the process of collecting the readers, four types of bias were 
introduced into the sample. With respect to the first three, the 
level of bias could be reduced significantly simply by gathering 
samples of stories from the countries now missing in McClelland's 
list. (1) At the outset, for reasons that are not made clear, McClelland
decided to exclude from his sample countries lying in
the tropics. This specification systematically eliminates all the 
Caribbean and Central American republics, most of the South 
American countries, most of Africa south of the Sahara, and all 
of Southeast Asia. (2) Communist Bloc countries are under- 
represented: the sample includes only the Soviet Union, Poland, 
Hungary, and Bulgaria; it excludes, most likely due to the un- 
availability of data, Czechoslovakia, East Germany, Rumania, 
Yugoslavia, Albania, Communist China, North Vietnam, North 
Korea, and the Mongolian People's Republic. (The ratio of popu- 
lation in the former to the latter group of countries is approxi- 
mately 1 to 3.) (3) There is a bias in favor of economically ad- 
vanced countries. Perhaps this was unavoidable, since levels of 
education appear to be correlated with levels of economic growth. 
But, as it turns out, the average gross national product per capita 
in 1961 for the 41 countries (including Brazil, the analysis of 
whose readers was, according to McClelland, open to “coding 
bias”) was $668, and for some 81 excluded countries $230, or 
about one third as much; the percentage growth in GNP per 
capita from about 1950 to 1960 was 3.41 per cent for 34 included
countries for which we have data, as opposed to 2.54 per cent for
34 excluded countries for which data are available.®* Since the
focus of McClelland's study is the relationship between a societal
need for achievement and that society's economic development, I 
cannot help but believe that this particular sampling bias severely 
limits the value of the findings, as well as the predictive value of 
his main hypothesis. 
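
The size of this third bias can be checked directly from the figures quoted above. The short Python sketch below does nothing more than that arithmetic; the variable names are mine, and the four input figures are those given in the text.

```python
# Figures quoted in the text (from the World Handbook data cited there).
gnp_included = 668.0     # average GNP per capita, 1961, 41 included countries
gnp_excluded = 230.0     # average GNP per capita, 1961, some 81 excluded countries
growth_included = 3.41   # per cent growth in GNP per capita, 34 included countries
growth_excluded = 2.54   # per cent growth in GNP per capita, 34 excluded countries

# Excluded countries' income as a fraction of included countries' income.
gnp_ratio = gnp_excluded / gnp_included

# Difference in growth, in percentage points.
growth_gap = growth_included - growth_excluded

print(f"excluded/included GNP per capita: {gnp_ratio:.2f}")
print(f"growth gap in percentage points: {growth_gap:.2f}")
```

The ratio comes to roughly 0.34, which is the "about one third" stated above, and the included countries grew faster by 0.87 percentage points. On both measures, then, the sample leans toward countries that were richer and growing faster.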

Data from Bruce M. Russett and Hayward R. Alker, Jr., Karl W. Deutsch, and
Harold D. Lasswell, World Handbook of Political and Social Indicators (New Haven:
Yale University Press, 1964), pp. 149-61.

The fourth type of sampling bias cannot be rectified without
scrapping all of McClelland's data and starting anew. This bias
stems from the means used to collect the readers. After looking
unsuccessfully in the Library of Congress for such readers, McClelland
wrote to the ministries of education of the various countries,
asking for three “widely-used” readers dating from 1925 and
1950. Where responses were not forthcoming, he relied upon book 
dealers in the countries and upon private sources. Such a sampling 
process may have been dictated by necessity, but we must be 
absolutely clear about the fact that it could not produce (except 
by the sheerest of accidents) a random sampling of readers used 
in the countries during those time periods. We have no way of 
knowing either how representative the selected readers are or how 
“widely” they were used. And information of this sort is vital for 
an evaluation of the project’s findings.^® 

How appropriate are children's readers for an assessment of
societal values anyway? As will be seen, not even McClelland is 
sure of the answer to this question. In fact, in the paragraphs that 
follow I shall use his arguments and counter-arguments exten- 
sively. The difference lies in the conclusions we reach. 

McClelland first of all rejects the simple notion that the stories 
in the readers “represent” solely characteristics of their authors'
personalities. While recognizing that this may be true in part, he 
sees the author not as a creator but as a mediator. The author 
transmits aspects of the culture to a particular audience: “children
and the adults having to do with the education of children who 
will decide whether their stories will be included in the textbooks 
or not.”*^ Such a position raises two problems. First, there are 
several points in the process of transmitting values at which errors 
of one sort or another can creep in. The author, for instance, has 
a wide stock of folklore available to him when he sits down to 
write. On what basis does he make his selection of stories to 
retell? Can we safely conclude that his vision of cultural values 
is reasonably accurate? His manner of writing may emphasize 
some values in a story and relegate others to a minor role. What
criteria of selection are used by those who must decide to adopt
or not to adopt for classroom use a particular set of stories? As
Los Angeles teachers learned in the decade after World War II,
when they decided to adopt UNESCO readers for their students,
the question of the criteria of selection can become a very touchy
issue in the community at large. Second (and this may be a minor
consideration), the process by which textbooks are prepared for
school use is more complicated than McClelland suggests. There
are several factors that help to determine what readers a committee
of the school board may choose among: contracts between
publishers and particular writers; copyrights and copyright infringements;
production costs; marketing considerations; regional
differentiation (e.g., how likely is it that a textbook manufacturer
will be able to sell to a school board in Mississippi a reader which
pictures white and Negro children playing together?); personal
relationships between salesmen and school board members in particular
areas; and so forth. Even the reader most representative of
cultural values can easily fall into a trap somewhere along this
route.

It is significant in this regard that McClelland himself reveals a tendency to
question the representativeness of the readers when particular data do not fit his general
findings. When, for instance, he finds that Poland is an overachiever despite low n
Achievement scores in the readers, he writes: “A recheck of the readers used . . .
suggests that they may not have been representative,” since trade books of children’s
stories rather than schoolbooks had been used. When he finds that Turkey has rapid
economic growth, high reader n Achievement levels, but low n Achievement among
managers, he writes: “We can reasonably infer that the schoolbooks may be atypically
high in n Achievement.” David C. McClelland, The Achieving Society (Princeton:
Van Nostrand, 1961), pp. 101, 266.

Ibid., p. 71.

McClelland then raises “the theoretical issue as to whether
fantasy reflects what a person has or doesn't have.”®* Although
there are good reasons why either alternative should be true, I 
am not aware of any research that settles the question finally. 
Research on n Affiliation performed by McClelland's associates is 
instructive in this regard. They asked two groups of college
freshmen (the first comprising men who had just been accepted
into social fraternities, and the second consisting of men who had
been rejected despite their desire to join a fraternity, and who had
afterward expressed their disappointment to the dean) to write
short stories describing what was probably happening in a picture
flashed before them on a screen. It turned out that the level of 
n Affiliation in stories by rejected subjects was almost twice as 
high as that in the stories of their more socially accepted col- 
leagues. These data cannot answer the question conclusively, but 
they do suggest that social rejection is associated with n Affiliation 
(unless there is an intervening variable that accounts for both). 
Whether this relationship between non-affiliation (social rejection) and n
Affiliation holds true for a person's sense of having achieved
something and his n Achievement score is something that remains
to be seen. As McClelland points out, “it is impossible to decide
on theoretical grounds which of these two alternatives is most
likely.”^

Ibid., p. 76.

Another major problem pertains to the values themselves as they 
are portrayed in the children's readers. Even if we assume that the
author has played the role of cultural bridge properly, we must 
ask, along with McClelland, “of what or of whom” the values are 
typical. Do they represent values typical of the culture as a whole 
or of specific subcultures (e.g., the intellectual elite)? Do they 
represent the values actually held by most of the people in the 
culture or just the “best” values that they want transmitted to
children? Or do they even comprise a set of values that a ministry 
of education is trying to inculcate in a population? Given the
sampling process used, there is no simple answer to such ques- 
tions. McClelland has pointed to several problematical examples— 
the fact that Algerian and Tunisian readers, although dealing with 
North African themes, were printed in Paris; the fact that Soviet 
readers of the 1920's dealt with values clearly not held by the 
masses of peasants; the fact that Argentine readers in the post- 
World War II years had “a very strong political slant in that most 
of the stories glorified the then dictator Juan Peron”^— and it is 
not difficult to think of others. American civics texts, for example, 
emphasize the value of individual political participation but, as 
may be seen by the level of participation (other than voting) in 
any national or state or local election, this value is not shared in 
practice by most Americans. 

McClelland's efforts to date to discover of what or of whom the 
readers are “typical” in a national culture have not been very 
satisfactory. A national sample survey of Catholic and Protestant 
students in the United States offered some confirmation of the 
thesis that values in readers are typical of more generally held 
cultural values. Less representative but cross-national surveys, 
however, have held out less hope. In countries where readers were 
low on the n Achievement scale, students scored high in projec- 
tive tests for levels of n Achievement, and vice versa. McClelland 
rejects the conclusion that such findings cast doubt upon the 
validity of reader n Achievement scores as indicators of cultural 
values. Instead, he suggests that these findings show that 

reader scores may not reflect n Achievement levels in any group of individuals
in the country: in this sense any comparison with individual
scores is invalid or unrepresentative. Rather, the reader stress on achievement
may represent something more like “national aspirations,” the
tendency of people in public (e.g., in children's textbooks) to think about
achievement.®^

Ibid., p. 101.

In short, after data or their absence have failed to confirm other 
interpretations of his representational model, McClelland falls 
back upon the conclusion that the readers must represent the 
totality of the culture which produced them. 

But McClelland is obviously not happy with this conclusion. His 
final statement on his representational model is illuminating: 

Comparison of reader n Achievement levels with levels obtained from 
individuals has raised some interesting questions as to just what the read- 
ers are measuring. It has even thrown some doubt on whether they are 
measuring anything of importance, but in the end, the proof of the 
pudding is in the eating: do they enable us to predict which countries 
will develop more rapidly economically? 

That the readers do enable such predictions—at least to McClel- 
land's satisfaction, if not always to that of others— does not get 
around the fact that his argument begs the key question of the 
representational model. 

Angell: Social Values of Soviet and American Elites. The study 
focuses on social values held by segments of the Soviet and 
American elites, with particular attention paid to values important 
in the foreign policy making process. Values are defined as 
perceptual images, that is, elements “of the good life as seen by the 
person who cherishes” them. Of the six elite groups identified as 
being most relevant for the two societies, four are fairly 
comparable: military, scientific, cultural, and labor elites. The remaining 
two elite groups in the Soviet Union are the government-Party 
elite and the economic elite, in the United States the cosmopolitan 
elite and the provincial elite. Angell selected the following 
publications as representative of the various elites: 

Ibid., p. 79; italics added. Later on, McClelland adds that “the 1950 finding 
suggests . . . that n Achievement levels in children’s readers are more of a reflection 
of the mood or motivational level of a nation at the time than an educational influence 
which is affecting the next generation.” Ibid., p. 101. 

Ibid., p. 79. 

Robert C. Angell, “Social Values of Soviet and American Elites: Content Analysis 
of Elite Media,” Journal of Conflict Resolution, VIII (1964), 330. 

United States 
  Cosmopolitan       New York Times, Fortune 
  Provincial         Nation's Business, American Bar Association Journal 
  Labor              American Federationist 
  Military           Army, Navy, Air Force 
  Scientific         Science, American Scientist, GeoTimes, 
                     American Institute of Biological Sciences Bulletin, 
                     Chemical & Engineering News, Physics Today, 
                     Bulletin of the Atomic Scientists 
  Cultural           Saturday Review, Harper's 

Soviet Union 
  Government-Party   Pravda, Kommunist, Voprosy Filosofii 
  Economic           Voprosy Ekonomiki, Sovetskaia Torgovlia, 
                     Planovoe Khoziaistvo 
  Labor              Sotsialisticheskii Trud 
  Military           Krasnaia Zvezda 
  Scientific         Vestnik Akademii Nauk, Vestnik Vysshei Shkoly 
  Cultural           Novyi Mir, Literaturnaia Gazeta, Teatr 


The period covered is from May 1, 1957 to April 30, 1960, a three-year 
period of relative peace and quiet in Soviet-American relations. 
The frequency of positions taken by or attributed to the 
various elite groups in these publications was tabulated, inter- 
country variations in positions were systematically examined, and 
some attention was given to intra-country variations in value 
positions. 

Three types of problems arose in the selection of the sample. 
First, there was the question of the size of the sample for each 
publication. Angell decided to allot “roughly equal reading time 
... to the periodicals for each of the six elites,” but, if anything, 
more effort should be put into analyzing Soviet journals since “we 
felt we already knew much more about American than about 
Soviet society.” He does not give information about the reasons 
for the particular sampling design chosen, e.g., every 22nd daily 
issue of Pravda but every 43rd daily issue of the military journal 
Krasnaia Zvezda. A second problem was the paucity of value 


Ibid., p. 33^. 
statements in some sources. This was particularly the case with 
American scientific journals: 

It was not intended originally to use so many of them, but when it 
became apparent that Science and The American Scientist were going to 
yield so little, it was necessary to find more specialized scientific 
periodicals that had some editorial or editorial-like material. The Bulletin of the 
Atomic Scientists is much the richest in the kind of material desired, 
but only a small sample was taken here because the scientists who 
support the journal are regarded by other scientists as not wholly 
representative of the profession. 

Third, the question of what to read was important. In some 
American cases (e.g., the New York Times) it was thought 
sufficient to analyze only editorials; except for specific categories of 
items (e.g., obituaries, "articles of an exclusively historical char- 
acter not giving value preferences in the period studied”), every- 
thing in the Soviet publications was included in the analysis. 
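
The sampling intervals mentioned under the first problem (every 22nd daily issue of Pravda, every 43rd of Krasnaia Zvezda) can be made concrete with a short sketch. It assumes, purely for illustration, that each paper appeared on every calendar day of the study period and that sampling began with the first issue; neither assumption comes from Angell's article, and the function name `systematic_sample` is hypothetical.

```python
from datetime import date, timedelta


def systematic_sample(start, end, step, offset=0):
    """Return every `step`-th daily issue date from start to end (inclusive).

    Assumes one issue per calendar day, which is an illustrative
    simplification and not a claim about either newspaper.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(offset, n_days, step)]


# Study period as reported in the text.
START, END = date(1957, 5, 1), date(1960, 4, 30)

pravda = systematic_sample(START, END, step=22)
krasnaia_zvezda = systematic_sample(START, END, step=43)

print(len(pravda), "issues sampled at an interval of 22")
print(len(krasnaia_zvezda), "issues sampled at an interval of 43")
```

Under these assumptions the wider interval yields roughly half as many issues, which is the kind of difference in weighting that the research design would need to justify explicitly.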

The representational model in Angell’s study is, in essence, 
merely an extension of the "prestige papers” idea, except that it 
explicitly rejects any inferences about the readers of the journals. 
Angell nonetheless compounds the problem faced by Pool and 
others by assuming that individual publications are representative 
of separate elites in Soviet and American life. Even if we are 
willing to accept the New York Times as indicative of "elite atti- 
tudes,” we may well be unwilling to agree that the American Bar 
Association Journal is indicative of any attitudes other than those 
of the men writing its editorials. It may still be possible to view 
the distribution of values in all the American publications taken 
together as somehow an indicator of values held by a broad 
stratum of American elite groupings; similarly, the entire collec- 
tion of Soviet journals may give us a better idea of Soviet elite 
values than would one "prestige paper” by itself. 

Angell's analysis raises an interesting question about 
intra-country variations in totalitarian societies. He writes: 

enough free play has developed near the top of Soviet society since the 

Ibid.; this statement underlines the necessity for making the research and 
sampling design more explicit than was done, for it suggests that the journals were weighted 
somehow according to their “representativeness.” 

It is interesting in this regard that Angell gives less space to intra-country than 
to intercountry variations; his data nonetheless present interesting possibilities for 
analyses of within-nation differences. 

death of Stalin for elite differences of value to come to light in Soviet 
periodicals. It is true that the explicit differences on the Soviet side are 
not striking, but in a good many of our value dimensions they are real. 

To what extent may we expect publicly expressed attitudes, 
beliefs, and values to be different in various Soviet “elite” 
publications? If we expect uniformity (the monolithic hypothesis) and 
discover differences, must we revise our estimate of an enslaved 
press? Or if we expect differences (the pluralist hypothesis) and 
find them, is our expectation confirmed? Unfortunately, if all we 
have access to is Angell’s data, we must respond negatively to 
both these latter questions. What is needed to test the alternative 
hypotheses is time-series data to establish the changing limits of 
acceptable differences in a totalitarian society: we need to know 
whether the level of intra-country variation is now greater, lesser, 
or about the same as it was during Stalin's heyday. 
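
The comparison called for here can be sketched in a few lines. The figures below are invented placeholders, not data from Angell or any other study; the publication labels, the period labels, and the function name `dispersion_by_period` are likewise hypothetical. The sketch measures, for each period, how far publications differ from one another in the share of statements taking a given value position.

```python
from statistics import pstdev


def dispersion_by_period(shares_by_period):
    """Standard deviation across publications, computed period by period.

    `shares_by_period` maps a period label to a dict of
    publication -> share of statements taking one value position.
    """
    return {
        period: pstdev(list(shares.values()))
        for period, shares in shares_by_period.items()
    }


# Invented illustrative shares (NOT real findings).
shares = {
    "earlier period": {"journal_A": 0.40, "journal_B": 0.42, "journal_C": 0.38},
    "later period": {"journal_A": 0.40, "journal_B": 0.55, "journal_C": 0.25},
}

for period, spread in dispersion_by_period(shares).items():
    print(f"{period}: spread across publications = {spread:.3f}")
```

Only a rise or fall in this spread from one period to the next, rather than its level in a single period, would bear on the choice between the monolithic and pluralist hypotheses.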

Finally, in reviewing Angell's study, we must ask if it is proper 
to accept for analytical purposes views attributed to one elite by 
members of another. Before we can consider such evidence, it is 
necessary to know, for instance, how reliable an estimate of “the 
military mind” is likely to be found in the editorial columns of 
the Bulletin of the Atomic Scientists, or how accurately writers in 
GeoTimes will reflect the mood of America's cultural elite. 
Fortunately, Angell clearly recognizes the danger of 
misinterpretation due to such attributions, and even keeps the attributed value 
positions separate from the direct assertions in his tables and 
analyses. 

Validating Representational Models: A Plea for Future Research 

The bulk of my remarks to this point have been critical. In con- 
centrating on the weaker aspects of some of the recent cross- 
national content analyses, however, I do not mean to suggest that 
the analyses themselves have been without merit. But the fact 
that their results have been both interesting and fruitful in terms 
of generating hypotheses about political behavior does not hide 
the fact that their theoretical underpinnings and assumptions have 
often been insufficiently examined. Perhaps the time has come for 
content analysts to look again at their research tools, just as survey 
researchers in the 1940's and early 1950's turned their attention to 


Ibid., p. 335. 

some procedures that many of them had come to take for granted. 
The following list of specific and feasible research tasks is certainly 
not exhaustive, but it may serve as a beginning. 

1. We need intensive analyses of a wide variety of publications 
in each country that is of interest to us. What we want to know 
is the range of attitudes, perceptions, and values expressed in 
these media on a variety of variables over time. If we had such 
information, it would be possible for a researcher to specify that 
for a particular analysis he wants to examine a publication that is, 
let us say, pro-labor but otherwise conservative on economic mat- 
ters and liberal politically; or a conservative political journal with 
avant garde attitudes on culture; or a newspaper that, on a given 
range of variables, is “typical” of all newspapers in the country; 
or a journal that became progressively more liberal over time. 

2. It would be possible to compare such empirically-based de- 
lineations of media characteristics with the ratings of knowledge- 
able judges on the same dimensions. For some purposes it may 
turn out that judgmental ratings are sufficiently accurate; or we 
may find that judges with certain types of background are most 
qualified to rate the press on certain dimensions. 

3. We need to pay closer attention to readership surveys, such 
as those conducted for advertising purposes, to get an idea of the 
audience of particular publications. It should be possible, for 
instance, to find out approximately how many people with high 
professional and socio-economic status report that they read the 
New York Times, Nation, or Life. Looked at from the other 
direction, it should be possible, after getting estimates of the number 
of members of a particular elite grouping, to determine what per- 
centage of that number reads a particular journal. 

4. Closely related to this is the need to utilize both intensive and 
extensive survey research to determine the extent to which 
people’s views— attitudes, perceptions, values— parallel those pre- 
sented in the publications they read regularly. Note that here I 
am asking for the delineation of an empirical relationship rather 
than inquiring into causality (i.e., does the reader read the 
publication because it reflects his views? or does the periodical shape 
his views? or is the true relationship a bit of both?). 

5. It would be useful to have corroborative evidence of an 
aggregate nature for a content analysis. Examples of such evidence 
might include public opinion analysis, elite interviews, content 
analysis of other types of messages, and other indicators of 
behavior. Ideally, of course, for any research project we would 
like to use all indicators available, bringing them to bear upon the 
aspect of behavior in which we are interested. For instance, if we 
are interested in the development of Western European attitudes 
toward arms control and disarmament, all the above types of evi- 
dence could serve as independent indicators of different aspects 
of Western European decision-making processes. 

6. Turning from inferences about antecedent events (WHY, 
WHO) to inferences about consequent events (WHOM, WHAT 
EFFECT), there is a crying need to integrate social psychological 
data on attitude change with the theory of content analysis. What 
is the function of an attitude for an individual? What conditions 
maximize the impact of a written communication on a person's 
attitudes, perceptions, and values? (For example, what role does 
the person's previous level of information play? Or his 
commitment to a particular ideology?) How important is written 
communication relative to an individual's face-to-face communications 
network? 

7. It should also be possible to undertake intensive interviews 
with a wide variety of elite groupings to ascertain the extent to 
which their members report that they are influenced by particular 
media. Do they look to these media for authoritative information 
and ready-made views? What other sources influence them? Are 
the particular media in which we are interested more or less im- 
portant than the other sources of information and attitudes? 

8. For cross-national research the question of functional equival- 
ence is crucial. The task is to find sets of messages that perform 
approximately the same function in all the societies included in 
the analysis and that are capable of being analyzed using the same 
set of research tools. Some differences are ones of format: The 
New York Times and The Times of London have clearcut editorials 

One approach is suggested by Ole R. Holsti and Robert C. North, “Comparative 
Data from Content Analysis: Perceptions of Hostility and Economic Variables in the 
1914 Crisis,” in Richard L. Merritt and Stein Rokkan (eds.), Comparing Nations: The 
Use of Quantitative Data in Cross-National Research (New Haven: Yale University 
Press, 1966), pp. 169-90. But in their comparison of perceptual data on conflict with 
“hard” data such as the flow of gold, stock market fluctuations, and commodity futures, 
other questions arise. If it is true that there is a strong correlation between the 
perceptual and the other data in the six critical weeks that led up to the outbreak of 
World War I, is it also true that perceptual data on conflict would correlate with the 
“hard” data in times of peace or in times of a severe financial crisis? If there is a 
high degree of correlation in all such circumstances, why bother with the arduous 
task of collecting the perceptual data? Using Lazarsfeld’s concept of the 
interchangeability of indicators, would it not be simpler (and cheaper in terms of research time 
and resources) to use the “hard” data as indicators of perceived conflict? 

on a variety of issues, for instance, whereas Le Monde usually 
carries only one editorial (on foreign policy) and the Frankfurter 
Allgemeine Zeitung relies mainly on signed, editorialized news 
columns. Other differences are more fundamental. What is the 
level of party partisanship in the press? What is the national ethic 
about an independent and “objective” press? How widely read by 
the opposition are “prestige papers”? What effect does the literacy 
level of a country have upon the nature and influence of its press? 
What is the difference between “elite” and “mass” newspapers in 
different countries? What is the difference in press attitudes 
between countries with regional newspapers and countries with 
national newspapers (e.g., between the United States and Great 
Britain)? How comparable is the press in totalitarian and 
democratic societies? What are the bounds of permissible disagreement 
in the different media of a communist state? What effect does a 
change in government have upon the press of various countries? 
Questions such as these, I might add, are fairly simply answered 
for European countries, but grow extremely complex when we 
try to account for media in non-Western areas. 

The research tasks that I am proposing are basic. They will 
not produce glamorous results that will dazzle the eyes of our 
colleagues and lay a golden path to foundation support. But they 
will serve to give a firmer foundation to cross-national research 
using content analysis. 


Another severe problem is the functional equivalence of units of analysis (e.g., 
words, which may have different meanings over space and over time). 

Introductory Note 

“la carrière ouverte aux talens” --Napoleon 

The author offers seven statistical techniques applicable in 
political science. Factor analysis is a device for data analysis which 
has already received considerable attention in political science 
research. Power spectrum analysis appears applicable to research 
involving observation of data in a time series, such as election 
returns. Cluster analysis or numerical taxonomy may be applied 
to the testing of hypotheses regarding influence relationships in 
groups. General linear hypothesis is a technique for discerning 
causal inferences. It may, the author notes, test the hypothesis that 
religion is a significant variable in explaining political 
contributions by corporate executives. To use an example suggested by 
Hayward Alker, it may test the hypothesis that a high level of 
voter participation stimulates a high level of government 
expenditures (or the reverse). 

In considering statistical devices appropriate for predictive 
models of decision-making, Carl Kossack discusses simultaneous 
linear regression. His economic example, the relation of price 
and quantity, suggests analogous political applications (i.e., 
Alker’s voter participation-government activity nexus). Simultane- 
ous equations are useful here because we are not certain which 
is the independent variable and which the dependent variable. 

Response surface analysis includes a variety of statistical pro- 
cedures which may be utilized to discern "the best operating con- 

See, for example, Hayward R. Alker, Jr., “Dimensions of Conflict in the General 
Assembly,” American Political Science Review, LVIII (September, 1964), 642-57. 

S. Sidney Ulmer uses the modified factor analysis developed by McQuitty to analyze 
behavior of judges. See Ulmer, “The Analysis of Behavior Patterns in the United States 
Supreme Court,” Journal of Politics, XXII (1960), 629-53. See also L. L. McQuitty, 
“Elementary Factor Analysis,” Psychological Reports, IX (1951), 71-78, and see his 
“Elementary Linkage Analysis for Isolating Both Orthogonal Types and Typal Rele- 
vances,” Educational and Psychological Measurement, XVII (1957), 207-29. 

Cf. Donald E. Stokes, “A Variance Components Model of Political Effects,” in 
John M. Claunch (ed.), Mathematical Applications in Political Science (Dallas: Arnold 
Foundation Monographs, Southern Methodist University, 1965), pp. 61-85. 

See the cluster analysis technique employed by Ulmer, “The Analysis of Behavior 
Patterns in the United States Supreme Court,” Journal of Politics, XXII (1960), 
629-53. 

Mathematics and Politics (New York: Macmillan, 1965), p. 66. 

Ibid., pp. 68-73. 

ditions for a system.” The canvassing resource allocation 
problem, discussed by Kramer in this volume, comes immediately to 
mind. The techniques discussed by Kossack are designed to yield 
optimum operational conditions to inform the decision-maker in 
planning and in the use of resources. 

Classification techniques, the final procedures noted by Kossack, 
have a variety of practical applications to politics, particularly in 
the field of administration. The author uses the example of an 
admissions problem in a college: evaluating prospective students 
by means of high-school grades and a battery of test scores. The 
problem requires that similar data on a previous population of 
students and actual college performance by these students be cor- 
related in order to estimate the likelihood of the new student 
population completing the college curricula. A useful and realistic 
classification system may be constructed when the model is ap- 
plied to the earlier experience. 



Statistical Analysis, The Computer 
and Political Science Research 

CARL F. KOSSACK 

University of Georgia 
Introduction 

As our society becomes more and more complex, the need for 
sophisticated methods of analysis through which one can study 
problems associated with various activities becomes more and 
more evident. This need is present not only in the biological and 
physical sciences but also in the social and political sciences. In 
fact, one can successfully argue that in the so-called 
the need is increasing at an even more rapid rate than in the 
sciences. Our social and political systems have become 
increasingly more complex, with higher order interactions playing a role 
today that was unknown in the earlier rural society of the past. 

Since one of the approaches considered by statisticians is that 
of inductive reasoning (making valid conclusions from evidence 
contained in observational data) and since man's understanding 
of most political phenomena is such as to exclude the possibility 
of mathematical or stochastic modeling of the system, it seems 
appropriate to consider in this paper some of the recent advances 
that have been made in statistics, particularly as they relate to 
problems encountered in political science research. If one 
associates with these advanced statistical applications the power of 
modern digital computers, it is now quite feasible to consider 
applications which one could not even dream about some ten 
years ago. One should also note that modern computers provide 
a technological bridge which can make available to the researcher 
analytical techniques which far exceed his in-house capabilities. 

In considering the phases through which a scientific investiga- 
tion usually advances, one possible classification of these phases 
would be: 

The Data Analysis Phase— During this phase of a scientific in- 
quiry, the researcher acquires observational data associated with 
the phenomenon being studied and “processes” these data in an 
attempt to discover important or interesting relationships within 
the data as well as to screen the data to eliminate errors along 
with non-discriminating variables or measurements. One of the 
most important statistical aspects to be considered during this 
phase is that of the sampling plan to be used during the collection 
of the data, since one not only is interested in having his data 
representative of the general situation but is also interested in the 
efficiency of his data collection procedures. 

The Hypothesis or Relationship Testing Phase— After analyzing 
the initial set of data, the researcher progresses to the stage where 
he can generate hypotheses with regard to the phenomenon under 
study. Since these hypotheses generally were acquired using 
available observational data, it is required to test statistically such 
hypotheses. To do this it is generally necessary for one to develop 
an experimental design or sampling plan from which one 
efficiently acquires new data upon which such statistical tests of the 
hypotheses can be generated. 

The Decision-Making Phase— Once a phenomenon is well enough 
understood, at least as far as its role in applied systems is con- 
cerned, interest centers on the use of this understanding of the 
phenomena to improve decision-making capabilities. In statistics, 
this type of activity falls under the general category of statistical 
decision theory as applied to complex systems. The natural out- 
growth of system decision-making is that of stochastic modeling 
of the system, including the study of such models through the use 
of simulation techniques. In fact, many individuals feel that the 
ultimate objective of research is the formulation of a stochastic or 
probability model of the phenomenon under study and the use of 
such a model to improve one’s decision-making capabilities. 

It is not the intention of this paper to consider in depth the 
nature of scientific study, but simply to note the natural phases 
through which many such statistically oriented studies evolve. 

The Role of the Computer 

It has been generally recognized that the modern digital 
computer has greatly enhanced man's ability to process data and to 
do scientific computing. In fact, it may be conservatively stated 
that in the last twenty years computational power has increased 
six orders of magnitude, indicating that in some respects 
whatever data processing was evolved in the 1940’s can now be done 
a million times faster. Of real interest is how this increased 
capability will affect our scientific analyses in the future. This becomes 
especially important when one realizes that most problems of 
interest are multivariate, often involving the time variable. Until 
recently, most analyses were unable to cope adequately with such 
dynamic multivariate problems, and thus the researcher had to be 
satisfied with analyses that involved fairly restrictive assumptions. 
It is only natural to expect that some of the increased 
computational power available in the modern high-speed digital computer 
will be harnessed so as to reduce dramatically these restrictive 
assumptions. 

Still another aspect of modern computers is their ability to use 
internally stored programs. This means that sophisticated analyses, 
once they are programmed, can be made available to individuals 
all over the world. This capability should enable us to bridge the 
technological gap that exists between modern theory and practice. 
It is to be expected that the emerging sciences should find it 
possible to take advantage of theories and techniques that have 
been evolved in the more established fields without having to go 
through the long evolutionary mathematical process required in 
the past. 

These two aspects of modern scientific computers challenge 
one to consider how advanced statistical techniques may be 
introduced into scientific disciplines in a fashion that would enable one 
to apply these new techniques to his particular research problem 
with the minimum of difficulty. 

Advanced Statistical Applications 

I have reviewed the recent statistical literature and have selected 
from the new techniques found in the literature those which 
appear to be most promising, keeping in mind the 
dynamic-multivariate nature of most applied research problems and the 
capability of modern digital computers. In the remainder of this 
paper, I would like to discuss briefly seven such advanced 
techniques. In these discussions, I will try to follow a regular pattern: 
(a) the type of problem for which the technique is appropriate, 
(b) how the technique “solves” the problem, (c) what are the 
general advantages in utilizing a computer when applying the 
technique, and (d) a small list of appropriate references. 

The techniques selected are: 

Data Analysis 

I. Factor Analysis 
II. Power Spectrum Analysis 

Hypothesis Testing 

III. Cluster Analysis: Numerical Taxonomy 
IV. General Linear Hypothesis 

Decision Making 

V. Simultaneous Regression 
VI. Response Surface Analysis 
VII. Classification Techniques 

I. Factor Analysis 

In factor analysis, one is concerned with how to account for the 
observed correlation among all the observed variables associated 
with the phenomenon under study in terms of the smallest number 
of factors and the smallest residual error. In many respects, the 
problem considered by a factor analysis is that of attempting to 
reduce the number of variables needed to describe a phenomenon. 
It is recognized that if one indiscriminately adds more variables 
to his observational vector the law of diminishing returns sets in, 
and soon one is losing rather than gaining information because of 
the noise introduced by the additional variables. Thus, one would 
like to find a new set of variables (factors) which are essentially 
uncorrelated, each of which adds significantly to the information. 

In the analysis, no distinction is made between so-called inde- 
pendent and dependent variables, since prediction is not a con- 
sideration. Thus, while in regression the constants found as regres- 
sion coefficients are merely constants used in the prediction, in 
factor analysis the constants obtained suffer from the demand that 
the weights they give to the derived variables must admit to inter- 
pretation and the derived variables must have a scientifically 
meaningful interpretation. Fundamentally, the object is to dis- 
cover whether the variables can be made to exhibit some under- 
lying order that may throw light on the processes that produce 
the individual differences shown in all the variables. 

In a factor analysis, the basic mathematical model is 

$$S_{ji} = C_{j1}X_{1i} + C_{j2}X_{2i} + \cdots + C_{jq}X_{qi} + \cdots$$

where $S_{ji}$ is the “score” (measure) made by the $i$th individual 
on the $j$th “test” (variable), $X_{qi}$ is the measure for the $i$th 
individual on the $q$th factor (uncorrelated reference ability), 
and $C_{jq}$ is the weight given the $q$th factor relative to the 
$j$th variable. 

To determine the weightings, the $C_{jq}$'s, the correlation matrix of 
the original observational variables is “factored.” The 
mathematical problem associated with this factoring, since factoring is not 
unique, is to make the factoring such that the smallest number of 
interpretable factors are used, leaving an insignificant unexplained 
residual. It is evident that such a technique requires judgment on 
the part of the investigator, and, in fact, the several different 
methods of factoring the correlation matrix appear to yield 
different levels of effectiveness, depending on the particular type of 
problem being considered. 
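
The remark that factoring is not unique can be checked numerically. The sketch below is an illustration only: the loading matrix `C` and the 30-degree rotation are made-up values, and `reproduced_correlations` is a hypothetical helper name. It shows that two different sets of weights reproduce exactly the same common-factor correlations whenever one is an orthogonal rotation of the other.

```python
import numpy as np

# Hypothetical loadings of four variables on two uncorrelated factors.
C = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.2, 0.9],
              [0.1, 0.6]])


def reproduced_correlations(loadings):
    """Common-factor part of the correlation matrix implied by the loadings."""
    return loadings @ loadings.T


theta = np.deg2rad(30.0)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])  # orthogonal: Q @ Q.T = I

C_rotated = C @ Q

same_fit = np.allclose(reproduced_correlations(C),
                       reproduced_correlations(C_rotated))
print("weights differ:", not np.allclose(C, C_rotated))
print("reproduced correlations agree:", same_fit)
```

Because the data cannot distinguish between such solutions, the choice among them rests on interpretability, which is the judgment the text refers to.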

The most common solution which has been programmed for 
digital computers is that which is called the principal component 
solution coupled with an orthogonal rotation of the factor matrix. 
Input data for such programs usually are in the form of raw data, 
but may simply be the resulting inter-correlation matrix. Included 
in the output of most computer programs is the initial factor matrix, 
which is simply the coefficients $C_{jq}$, and the orthogonal rotated 
factor matrix. The satisfactory nature of the solution depends upon 
the ability of the analyst to interpret the factors effectively, 
considering their loadings relative to the original variables. 
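
A minimal sketch of such a program follows, assuming NumPy is available. It is not the program the author had in mind: the varimax criterion is used here as one common choice of orthogonal rotation, the simulated two-factor data set is invented for the illustration, and the function names are hypothetical.

```python
import numpy as np


def principal_component_loadings(R, n_factors):
    """Initial factor matrix: leading eigenvectors of R scaled by sqrt(eigenvalue)."""
    eigvals, eigvecs = np.linalg.eigh(R)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_factors]     # keep the largest ones
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix (variables x factors)."""
    p, k = L.shape
    T = np.eye(k)
    last = 0.0
    for _ in range(max_iter):
        LT = L @ T
        G = L.T @ (LT**3 - LT @ np.diag((LT**2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                                    # nearest orthogonal matrix
        if s.sum() < last * (1 + tol):                # criterion stopped improving
            break
        last = s.sum()
    return L @ T, T


# Simulated raw data: 500 cases, 6 variables, 2 underlying factors (invented).
rng = np.random.default_rng(0)
factors = rng.standard_normal((500, 2))
true_loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
                          [0.1, 0.8], [0.0, 0.9], [0.2, 0.7]])
scores = factors @ true_loadings.T + 0.4 * rng.standard_normal((500, 6))

R = np.corrcoef(scores, rowvar=False)                 # inter-correlation matrix
initial = principal_component_loadings(R, n_factors=2)
rotated, T = varimax(initial)

print(np.round(rotated, 2))
```

The rotation leaves each variable's communality (the row sum of squared loadings) unchanged; it only redistributes the loadings so that each variable tends to load heavily on one factor, which is what makes the factors easier to interpret.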

The power of the digital computer to perform a factor analysis 
is clearly indicated when one considers that there exists a program 
which will handle up to 80 variables with up to 10,000 cases 
(individuals) using as little as an hour of running time for this size 
problem. In fact, without a computer, a problem of this 
magnitude simply could not be analyzed. 

II. Power Spectrum Analysis 

One frequently obtains observations that are in the form of a 
continuous or discrete time series. Often such series are of a 
type that is called stationary. By this we mean that, if a 
random sampling is made from the time series with equal times 
between observations, obtaining the sequence of observations 
$x_1, x_2, x_3, \ldots, x_t, \ldots, x_n$, these $x$'s are such that for all $t$'s 
$E(x_t) = \mu_t$, the variance of $x_t = V_0$, and the covariance 
$(x_t, x_{t+s}) = V_s$ for all integer $s$. Of particular importance is the 
fact that the covariance between two observations $x_t$ and $x_{t+s}$ 
depends only on the time separation $s$ and not on the clock time $t$. 
One should recall that the covariance $(x_t, x_{t+s})$ is defined as 

$$\mathrm{Cov}(x_t, x_{t+s}) = E[(x_t - \mu_t)(x_{t+s} - \mu_{t+s})] = r_{x_t, x_{t+s}} \, \sigma_{x_t} \, \sigma_{x_{t+s}}$$

Thus a stationary time series is such that the correlation between
two observations does not depend upon where one takes the observations
but only on how far apart the two observations are.
We really are dealing with a signal that exhibits periodicity over
time rather than dynamic change. The problem associated with
the analysis of such stationary time series is to convert the analogue
signal into quantitative values which lend themselves more
readily to mathematical analysis.

Since the time series is periodic, it seems natural to assume that 
the signal is a composite of several cosine functions of varying 
frequency and amplitude. Along with these cosine functions one 
assumes that there is superimposed a random noise factor. Thus, 
one can hope to decompose the time series into the significant 
cosine functions and to replace the continuous set of data with a 
finite number of frequencies and associated amplitudes of these 
cosine functions. These frequencies and amplitudes can then be 
used in pattern recognition type problems so as to be able more 
readily to recognize patterns in the signal and to be able to dis- 
tinguish between signals coming from different underlying con- 
ditions or sources. 

In solving this decomposition problem through the use of a
power spectrum analysis, one formally considers that the signal is
given as a function of time, say x(t). Then the autocovariance
function is defined by

$$c(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x(t) \, x(t + \tau) \, dt$$

and the power spectrum at frequency f is given by

$$P(f) = \lim_{T \to \infty} \frac{1}{T} \left| \int_{-T/2}^{T/2} x(t) \, e^{-i 2 \pi f t} \, dt \right|^2$$

Now if the signal x(t) corresponds closely with the function
$\cos 2\pi f t$, the value of P(f) is large, while if the signal fails to
correspond, P(f) will be small. In fact, if there is actually no
correspondence, P(f), though theoretically zero, would exhibit some positive
value due to the random noise entering the evaluation. Pictorially
we could represent the stationary time series by the following
figure:

The power spectrum of such a time series could be represented by
the following figure:



In the above representation, it may be recognized that the original
signal is a composite consisting mainly of three sinusoidal functions
with frequencies $f_1$, $f_2$, and $f_3$, with amplitudes whose squares
(power) correspond to the heights of the three peaks exhibited on
the power spectrum graph.
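
The idea of a signal built from three cosine components plus noise, and a spectrum with three corresponding peaks, can be illustrated with a small sketch. Everything here is assumed for illustration: the frequencies, amplitudes, noise level and sampling interval are hypothetical, and the raw periodogram computed by a fast Fourier transform stands in for whatever estimator the programs of the period used.

```python
import numpy as np

def periodogram(x, dt):
    """Raw power-spectrum estimate |X(f)|^2 / N at the non-negative frequencies."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the mean level
    n = len(x)
    spectrum = np.abs(np.fft.rfft(x)) ** 2 / n
    freqs = np.fft.rfftfreq(n, d=dt)       # cycles per unit time
    return freqs, spectrum

dt = 0.01                                  # hypothetical time between samples
t = np.arange(1000) * dt                   # 1,000 sampled points
true_freqs = [5.0, 12.0, 20.0]             # hypothetical f1, f2, f3
amps = [1.0, 2.0, 1.5]                     # hypothetical amplitudes

rng = np.random.default_rng(1)
x = sum(a * np.cos(2 * np.pi * f * t) for a, f in zip(amps, true_freqs))
x = x + 0.5 * rng.normal(size=t.size)      # superimposed random noise

freqs, spectrum = periodogram(x, dt)
top3 = np.sort(freqs[np.argsort(spectrum)[-3:]])   # frequencies of the three tallest peaks
print(top3)
```

The three largest spectral values fall at the frequencies used to build the signal, and their heights grow with the squares of the amplitudes.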

In practice, the problem of estimating the power spectrum
through the use of digital computers requires first that one replace
the infinite range considered in the theory by a finite range. Now,
it is apparent that the restriction on the range of the data used
restricts the frequency range for which one can obtain estimates.
The lowest practical frequency that can be estimated corresponds
to one-half the range used. Next, one must replace the continuous
analogue signal with isolated sampled points. The sampling rate
used also imposes a restriction on the range of frequencies that
can be studied, since the highest frequency about which one will
have information is $\pi/k$, where k is the length of time between
sampled points. Unfortunately, the desire to take a longer range
of data with more frequent samplings is usually thwarted by lack
of sufficient data and/or the lack of computer capacity to analyze
the multiplicity of points generated by a high sampling rate.

In the computer programming of a power spectrum analysis,
the dimensions of the program depend on the size of the computer
being used; but for a large-size computer, programs exist
that can handle up to 20 different series with up to 1,000 discrete
data points per series. In computing the autocorrelations, up to
200 lags can be considered. Not only does the output of the program
include a plot of the input data, the autocorrelation function,
and printed and plotted power spectral estimates, but one is
enabled to examine the interrelationship between two series by
having the computer determine and print out the cross-covariance
and the "coherence function" for any pair of signals from the
several series used as input for the program.

III. Cluster Analysis or Numerical Taxonomy 

In many investigations, the amount of information obtained 
about each individual and the number of individuals studied are 
so large that the investigator finds it difficult to know where to
start his analysis. Since one of the purposes of science is to gen- 
eralize, one often would like to group individuals together into 
more or less homogeneous groups relative to the several measure- 
ments that have been made on each individual. At the same time, 
it may be of interest to differentiate between the numerous varia- 
bles as to those which provide discriminatory power in separating 
off such groups. 

A cluster analysis or numerical taxonomy program is designed 
to uncover statistical similarities within the data, to form clusters 
of the most similar cases, and to select those attributes which 
are statistically more important in determining the classification 
derived through this method. With the large mass of data and the 
many observations from each individual, the problem is to identify 
a few main classes of data rather than the many individual cases. 
Technically, one may argue that the data represents a mixture 
of samples from several distinct populations and that one is inter- 
ested in sorting the observations into their respective population 
groups without even knowing how many populations are involved. 
In some respects the technique could be called statistical sorting. 

In most numerical taxonomy programs, the analysis is ap- 
proached by first converting the data into attributes. That is, the 
information is reduced to a set of variables which can take on only 
the values zero or one. It is apparent that both quantitative and 
qualitative data can be converted into attributes, since the attribute
concept is essentially the basis behind all information. Thus,
for categorical type data, each possible class from a given category
can be made an attribute variable with the zero indicating the 
individual is not in the class and the one indicating the individual 
is in the class. In the case of measured variables, the range can be 
subdivided into many subranges and each subrange be considered 
as an attribute variable. 
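
The conversion of categorical and measured data into zero-one attributes can be sketched as below. The function names, the party classes and the age cut points are hypothetical and serve only to illustrate the idea.

```python
def categorical_to_attributes(value, classes):
    """One 0/1 attribute per possible class of a category."""
    return [1 if value == c else 0 for c in classes]

def measured_to_attributes(value, cut_points):
    """One 0/1 attribute per subrange; cut_points are ascending interior boundaries."""
    n_ranges = len(cut_points) + 1
    index = sum(value >= c for c in cut_points)   # subrange the value falls in
    return [1 if k == index else 0 for k in range(n_ranges)]

# Hypothetical individual: party affiliation (categorical) and age (measured).
party_classes = ["Democrat", "Republican", "Independent"]
age_cuts = [30, 50]                               # subranges: <30, 30-49, >=50

case = (categorical_to_attributes("Republican", party_classes)
        + measured_to_attributes(42, age_cuts))
print(case)
```

Each individual then becomes a fixed-length string of zeros and ones, whatever the mixture of qualitative and quantitative information collected.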

The analysis then generates a similarity coefficient which
indicates which observations have essentially the same
attribute structure. Thus we may use

$$S_{ij} = M_{ij} / N_{ij}$$

where $M_{ij}$ represents the number of attributes possessed in common
by cases i and j, while $N_{ij}$ represents the number of attributes
possessed by either of them. The similarity ratio could be considered
as the weighted probability of finding a matching attribute
between the two individuals for any characteristic selected at
random. Hence, a very large value of S would indicate a much
greater degree of similarity between the two cases than is likely 
with random distribution of the attributes. Similarly, a very low 
value of S would indicate a non-random divergence of character- 
istics between the two cases. 
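
As a minimal sketch of this similarity ratio, assuming each case is stored as a list of zero-one attributes (the two example cases are hypothetical):

```python
def similarity(a, b):
    """S = M / N: attributes held in common over attributes held by either case."""
    m = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return m / n if n else 0.0     # two cases with no attributes at all: take S = 0

case_i = [1, 1, 0, 1, 0]
case_j = [1, 0, 0, 1, 1]
print(similarity(case_i, case_j))  # M = 2, N = 4
```

Attributes that both cases lack do not enter either count, so S reflects only shared presence of characteristics.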

In order to measure roughly how "typical" a case is, a count, $R_i$,
is made of the number of other cases with which the case in question
has at least one attribute in common. Finally, a measure H
is made for each case by multiplying together all the R non-zero
values of S for that case. Thus,

$$H_i = S_{i1} \times S_{i2} \times S_{i3} \times \cdots \times S_{iR}$$

Here $H_i$ can be thought of as representing the probability, for any
characteristic selected at random from those attributes possessed by
case i, that all of the other R cases with non-zero S would also possess the
attribute. We have for each case two measures of typicality, $R_i$ and
$H_i$, where $H_i$ can be considered as a refined measure that could
be used to differentiate cases with the same R.

Thus, one may rank the cases first by descending values of R 
and then by descending values of H within each value of R. The 
first case listed can then be considered as the center of the first 
population grouping and other members of the cluster are then 
identified by high values of S in conjunction with this nodal case. 
The problem of where to cut off the cluster must be resolved
either by using judgment or by the ordered listing of (R,H)'s.
One may decide that the second cluster center should be taken
to be the individual who is at some predetermined rank within 
the table. When that case is found in the ordering of S's around 
the first cluster center, the cluster is truncated and a new cluster 
determined around the new center using the S’s associated with 
the remaining cases and the new nodal case. Thus, the clustering 
can continue until all cases are sorted into one of the clusters. 
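
The two typicality measures and the ranking that picks the first nodal case can be sketched as follows. The four cases are hypothetical, and the sketch stops at the choice of the first cluster center; the cut-off rule, which the text leaves to judgment, is not implemented.

```python
from math import prod

def similarity(a, b):
    """S = M / N for two cases stored as 0/1 attribute lists."""
    m = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return m / n if n else 0.0

def typicality(cases):
    """Return (R_i, H_i) for every case: count and product of its non-zero S values."""
    scores = []
    for i, a in enumerate(cases):
        nonzero = [s for j, b in enumerate(cases) if j != i
                   for s in [similarity(a, b)] if s > 0]
        scores.append((len(nonzero), prod(nonzero)))
    return scores

cases = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],   # shares no attribute with any other case
]
scores = typicality(cases)
# Rank by descending R, then by descending H within equal R.
ranking = sorted(range(len(cases)), key=lambda i: scores[i], reverse=True)
nodal = ranking[0]                      # center of the first cluster
neighbors = sorted(((similarity(cases[nodal], cases[j]), j)
                    for j in range(len(cases)) if j != nodal), reverse=True)
print(scores, nodal, neighbors)
```

The list `neighbors` orders the remaining cases by their S with the nodal case, which is the ordering along which the cluster would be truncated.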

The entire approach to numerical taxonomy makes the use of 
a high-speed digital computer most appropriate. The many logical 
comparisons and testings made in deriving values for M,N,S,R, 
and H are operations which a computer handles naturally. The 
sheer volume of the operations also requires a processing speed 
found only in electronic computers. For example, if there are only 
100 cases in the collection, then 4,950 values each of M,N, and S 
must be calculated for every analysis made with new or revised 
information. The basic approach described above is quite suitable 
for programming for most types of computers. 

IV. General Linear Hypothesis 

In experimental situations it is common to analyze multiresponse
experimental data using the appropriate univariate analysis on
each response. However, there is often an interdependency between
these responses which would be ignored by this single-response-at-a-time
approach. At the same time, the various univariate
analyses that one may elect to use, depending upon the circumstances
of the experiment, each involve, even when considered separately,
the analysis of some underlying mathematical
model, including both the estimation of parameters and the testing
of hypotheses associated with the model. The "general linear
hypothesis" concept combines these mathematical models into a
single general model, enabling one to use a single analytical approach
to such problems and at the same time to handle the multiresponse
problem.

It seems best to introduce the general linear hypothesis concept 
in the single response variable case and then to discuss its gen- 
eralization to multi-response type analyses. An attempt will be 
made to show how the general linear hypothesis model includes 
the analysis of variance, regression, and the analysis of covariance 
as special cases. 

In an attempt to show the generality of the model and the power
of the matrix notation, the generalized linear model concept will
be summarized in matrix form and then the model will be applied
to one or more of the special cases noted above, expanding the
expressions into a non-matrix notation. A mathematical model
must first be evolved showing how one feels the response variable
(dependent variable) relates to the design variables (independent
variables). In the general linear case, we have the model*

$$y = A\beta + e$$

where y is the vector of observations, A is the known matrix of design
variables, $\beta$ is the vector of unknown parameters, and e is the vector
of errors.

This model reduces to the following special cases:

i) The analysis of variance model (simple two-way design)

$$y_{ij} = \mu + \tau_i + \rho_j + e_{ij}$$

where $\tau_i$ is the i-th treatment effect, $\rho_j$ is the j-th block effect,
$\mu$ is the overall mean, and $e_{ij}$ is the experimental error.

ii) The regression model (multiple linear regression)

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_r x_{ri} + e_i$$

iii) The analysis of covariance model (simple two-way design)

$$y_{ij} = \mu + \tau_i + \rho_j + \beta_1 x_{1,ij} + \beta_2 x_{2,ij} + \cdots + \beta_r x_{r,ij} + e_{ij}$$

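To demonstrate how the general model reduces to these special
models, consider the regression case. Then $y = A\beta + e$ can be
written in expanded form as:

$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & \cdots & x_{r1} \\ 1 & x_{12} & \cdots & x_{r2} \\ \vdots & \vdots & & \vdots \\ 1 & x_{1n} & \cdots & x_{rn} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_r \end{bmatrix} + \begin{bmatrix} e_1 \\ e_2 \\ \vdots \\ e_n \end{bmatrix}$$

Expanding the expression for the i-th element yields the form
given under (ii) above.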

Associated with any mathematical model is a set of hypotheses
regarding the values of the parameters $\beta$, which are the unknown
constants of interest in the study. In a linear hypothesis
approach, these hypotheses must be expressible as linear relationships
involving the $\beta$'s and known coefficients. We thus have in the
general case the set of linear hypotheses expressed as

$$C\beta = 0$$

where C is a known matrix of coefficients.

*An italic lower case letter indicates a vector, while an upper case letter indicates a matrix.

To illustrate again the applicability of the general expression to
the special cases, we have for typical types of hypotheses the
following:

i) The analysis of variance model

$$\tau_1 = \tau_2 = \cdots = \tau_t$$

The hypothesis that a certain set of the treatment effects are all
equal.

ii) The regression model

$$\beta_{r-s+1} = \beta_{r-s+2} = \cdots = \beta_r = 0$$

The hypothesis that the last s independent variables have no
linear relationship with the dependent variable when considered
with the other r-s variables.

iii) Analysis of covariance model

$$\tau_1 = \tau_2 = \cdots = \tau_t$$

$$\beta_{r-s+1} = \beta_{r-s+2} = \cdots = \beta_r = 0$$

A combination of the hypotheses introduced under (i) and (ii).

Before the given set of hypotheses can be statistically tested,
one must first estimate the parameters, using the available experimental
data. These estimates are given in general form by

$$\hat{\beta} = (A'A)^{-1} A'y$$

where the prime indicates the transpose of the given matrix and
$(\;)^{-1}$ indicates its inverse. In the special cases these estimates reduce
to the familiar least squares estimates. Thus in regression

$$\hat{\beta}_i = \frac{\sum_{j=1}^{n} (y_j - \bar{y})(x_{ij} - \bar{x}_i)}{\sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2}$$
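
The general estimate $(A'A)^{-1}A'y$ can be checked numerically with a short sketch. The simulated data and the "true" parameter values are hypothetical; the normal equations are solved directly rather than by forming the inverse, which is a common numerical preference and not something the text prescribes.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
x = rng.normal(size=(n, 2))                  # two hypothetical independent variables
beta_true = np.array([1.0, 2.0, -0.5])       # hypothetical beta_0, beta_1, beta_2

A = np.column_stack([np.ones(n), x])         # design matrix with a column of ones
y = A @ beta_true + 0.1 * rng.normal(size=n)

# Solve (A'A) beta = A'y, which is equivalent to beta = (A'A)^{-1} A'y.
beta_hat = np.linalg.solve(A.T @ A, A.T @ y)
print(np.round(beta_hat, 3))
```

The same few lines serve for analysis of variance or covariance once the matrix A is filled with the appropriate indicator and covariate columns.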

The testing of the hypotheses is accomplished by using a test
statistic in the form

$$F(n_h, n_e) = \frac{SSH / n_h}{SSE / n_e}$$

where $SSH = (C\hat{\beta})' [C (A'A)^{-1} C']^{-1} (C\hat{\beta})$
and $SSE = y'y - \hat{\beta}' A'y$.

The statistic $F(n_h, n_e)$ is the Snedecor F statistic with $n_h$ and $n_e$
representing the degrees of freedom for the hypothesis and error
respectively. An alternative statistic that can be used is the Beta
statistic (distinct from the parameter vector $\beta$) defined by

$$\beta = SSE / (SSH + SSE)$$

The hypothesis is then rejected for significantly low values of $\beta$.
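
A minimal sketch of this test, assuming that A has full column rank and C has full row rank so that the degrees of freedom are simply the row count of C and the number of observations less the number of parameters. The function name and the simulated data are hypothetical.

```python
import numpy as np

def linear_hypothesis_test(A, y, C):
    """F and Beta statistics for the hypothesis C beta = 0 in the model y = A beta + e."""
    AtA_inv = np.linalg.inv(A.T @ A)
    beta_hat = AtA_inv @ A.T @ y
    Cb = C @ beta_hat
    ssh = Cb @ np.linalg.solve(C @ AtA_inv @ C.T, Cb)   # hypothesis sum of squares
    sse = y @ y - beta_hat @ A.T @ y                    # error sum of squares
    n_h = C.shape[0]                  # assumes C has full row rank
    n_e = A.shape[0] - A.shape[1]     # assumes A has full column rank
    F = (ssh / n_h) / (sse / n_e)
    beta_stat = sse / (ssh + sse)
    return F, beta_stat, n_h, n_e

rng = np.random.default_rng(4)
n = 40
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * rng.normal(size=n)   # y truly depends on x1 only
A = np.column_stack([np.ones(n), x1, x2])

F1, B1, n_h, n_e = linear_hypothesis_test(A, y, np.array([[0.0, 1.0, 0.0]]))
F2, B2, _, _ = linear_hypothesis_test(A, y, np.array([[0.0, 0.0, 1.0]]))
print(round(F1, 1), round(F2, 2), n_h, n_e)
```

The two statistics carry the same information, since the Beta statistic equals $1 / (1 + F \, n_h / n_e)$; a large F corresponds to a small Beta.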

Now, if the response is a vector quantity rather than a single
measure, the generalization is relatively straightforward; since, instead
of considering singly y, we must consider the vector $(y_1, \ldots, y_p)$,
and the general linear hypothesis must be written in the
form

$$(y_1, \ldots, y_p) = A(\beta_1, \beta_2, \ldots, \beta_p) + (e_1, e_2, \ldots, e_p)$$

Thus, the vectors of observations, parameters and error terms all
become matrices, and we can symbolically write the model as

$$Y = AB + E$$

The subsequent analysis of this generalized model proceeds in
a rather straightforward fashion (see Poston).

In considering the implications of this theory for computer programming,
one should realize that matrix algebra can be programmed
for computers in a fairly straightforward fashion. In fact, several
computer programming languages have been developed utilizing
vectors and matrices as basic elements in the language (see
Bargmann). Thus, the researcher having access to such a computer
program can analyze a large class of experiments without
having to develop separate techniques for each special case.

V. Simultaneous Linear Regression

Many problems arise that require the construction of a mathe- 
matical model that will represent the operation of a social, politi- 
cal, or economic system. From this model one is interested in 
predicting future events that will follow when one or more variables
in the model are changed or determining what policy should
be followed to give a desired result or outcome in the system. Or 
perhaps the purpose of the model is simply to describe the system 
in mathematical form. 

Given a model, one of the major research problems is to estimate
its parameters. When the model is not explicitly stated, one
often has to resort to simulation studies to help make reasonable
estimates; however, in the case of a system expressed by simultaneous
linear equations, the parameter estimates can be mathematically
determined. One of the important models of this type is
that of multiple linear regression, where one assumes a single dependent
variable is governed in a linear fashion by the levels of a
number of other "independent" variables. Simultaneous linear regression
can be considered as a generalization of multiple regression
to the following situation. In the multiple regression relation, one
and only one variable in each equation may be chosen as the dependent
variable, whose changes can be explained by those of the
explanatory, or independent, variables. Very often this choice is
arbitrary, since economic and social relationships are not normally
formulated in a simple manner. A typical example from economics
would be price and quantity for a product. Surely one would be
hard pressed to determine which to call dependent if the two
occur in a set of additional "independent" variables. In fact, even
if such a designation were made, the multiple linear regression
model would not be properly estimated.

In simultaneous linear regression, one is able to introduce as the 
model a system of simultaneous linear equations rather than a 
single regression equation. The system is termed the structural set 
of equations, since each equation relates to a fundamental aspect 
of the phenomenon being studied. Each structural equation may
have more than one dependent variable and a number of inde- 
pendent variables. 

For example, one may have the following five equations in the
structural set:*

$$y_1 = b_{12} y_2 + b_{13} y_3 + c_{13} z_3 + c_{14} z_4 + c_{10}$$
$$y_1 = b_{23} y_3 + b_{25} y_5 + c_{23} z_3 + c_{20}$$
$$y_2 = c_{32} z_2 + c_{34} z_4 + c_{30}$$
$$y_3 = b_{44} y_4 + c_{41} z_1 + c_{43} z_3 + c_{40}$$
$$y_4 = b_{55} y_5 + c_{53} z_3 + c_{50}$$

where the y's are dependent variables, the z's are independent
variables, and the b's and c's are the corresponding regression
coefficients.

*See M. A. Girshick and T. Haavelmo, "Statistical Analysis of the Demand for
Food: Examples of Simultaneous Estimation of Structural Equations," Econometrica,
XV (1947), 79-110.

Since there are several methods by which the regression coefficients
in such a set of simultaneous regression equations can
be estimated, computer programs vary as to which or how many
of these methods are included. Thus, we may use:

(i) Single-equation least-squares, where the estimates are obtained
by considering each equation separately, assuming that for
estimation purposes the first dependent variable in the equation is 
the dependent variable. Usually if this method is used the results 
are obtained for comparison purposes only. 

(ii) Two-stage least-squares still uses the single equation approach,
but in the first stage a correction is made in the estimates
for all but one of the dependent variables in the equation; and in
the second stage, these corrected variables are used to compute
the regression of the remaining dependent variable on all the
other variables in the equation.

(iii) Limited-information estimation again uses the single equa- 
tion approach, but, using the concept of maximum likelihood, 
treats all the dependent variables in the equation simultaneously. 

(iv) Full-information estimation considers all the equations 
simultaneously and once again resorts to maximum likelihood 
methods in making the approximation. 
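
To make the contrast between methods (i) and (ii) concrete, here is a sketch of two-stage least squares in its usual textbook form, which may differ in detail from the programs referred to above. The single structural equation, its coefficients, and the simulated data are all hypothetical; the "correction" of the first stage is taken to be the replacement of a right-hand dependent variable by its fitted values from a regression on all independent variables.

```python
import numpy as np

def ols(X, y):
    """Ordinary least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(3)
n = 5000
z1, z2 = rng.normal(size=n), rng.normal(size=n)      # independent variables
u = rng.normal(size=n)                               # error of the structural equation
v = 0.8 * u + 0.6 * rng.normal(size=n)               # error correlated with u

# Hypothetical system: y2 is a dependent variable that also appears on the
# right-hand side of the equation for y1 (true coefficient on y2 is 2.0).
y2 = 0.5 * z1 + 1.0 * z2 + v
y1 = 2.0 * y2 + 1.0 * z1 + 0.5 + u

ones = np.ones(n)
Z = np.column_stack([ones, z1, z2])

# Stage 1: replace y2 by its fitted values from all independent variables.
y2_hat = Z @ ols(Z, y2)
# Stage 2: regress y1 on the corrected y2 and the variables in its own equation.
b_2sls = ols(np.column_stack([y2_hat, z1, ones]), y1)
# Method (i) for comparison: single-equation least squares on the raw y2.
b_ols = ols(np.column_stack([y2, z1, ones]), y1)

print(np.round(b_2sls, 2), np.round(b_ols, 2))
```

In this simulation the single-equation estimate of the coefficient on y2 is pulled away from 2.0 by the correlation between y2 and the error term, while the two-stage estimate stays close to it.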

The computer time required for each of these methods increases 
with the sophistication of the method used in obtaining the esti- 
mates, and often one is led to reduction techniques or approxima- 
tions in applying the more sophisticated approaches in order to 
reduce the computer time. 

The use of this approach in the analysis of a system or phenomenon
requires that the investigator have enough insight into and
knowledge of his field to be able to formulate valid relationships
among the variables to be studied. In the political and social
sciences, he may thus be faced with a problem of substantial
difficulty, since in these fields there is little available in
the way of general guides. However, mathematical modelling of a
system is more or less an art, and the best one can do is to
study the work done by others in related fields.

VI. Response Surface Analysis

In 1951, Box and Wilson introduced a new concept into experimental
design by recognizing that, in many situations, one is not
so much interested in testing the significance of factors associated
with a system as in simply determining the best operating conditions
for the system. The class of experimental designs introduced
has become known as "response surface designs," and the associated
analysis yielding the optimum operational conditions is
known as "response surface analysis."
The behavior of any reaction is governed by laws which should
be representable in mathematical form, and thus it should be pos- 
sible to determine the optimum conditions for the reaction by 
simply applying these laws. However, one often finds in practice 
that the underlying mechanism of the system is so complicated 
that the mathematical representation using theoretical considera- 
tions is essentially impossible. This is particularly true in the social 
and political science fields. When one is faced with the need to use
the empirical approach and is interested in determining the best 
operating conditions, say, for an economic or political system, these 
response surface designs are appropriate. 

The theory is developed assuming that the response, Y, is dependent
upon n variables $x_i$, which are capable of measurement
and control. The form of the functional relationship $Y = f(x_1, x_2, \ldots, x_n)$
is unknown, and the problem is to find the combination of
values of the $x_i$ which optimizes the response within the region of the
n-dimensional factor space where experimentation is feasible, using
as few experimental observations as possible. The number of observations
required will, of course, depend upon the accuracy and
precision of estimation desired. Where the problem is one of mini- 
mization, it can always be converted to one of maximization; for 
example, by considering the improvement as compared with some 
standard instead of the actual level achieved. 

The technique assumes that the response function can be satisfactorily
represented by a quadratic form in the area of interest,
i.e.,

$$Y = \sum_{i=0}^{n} \sum_{j=i}^{n} c_{ij} x_i x_j + e$$

where Y is the property to be maximized, the $x_i$ are the levels of
the n independent variables ($x_0 = 1$), the $c_{ij}$ are the unknown
parameters to be estimated from the experiment and e is the residual
or experimental error. The adequacy of the quadratic surface
representation of the true response surface of the process being 
investigated depends on the use of a small sub-region of the factor 
space within which one restricts his determinations. In some ex- 
perimental situations, such a small neighborhood within which 
the optimum point can be assumed to lie is already known to the 
experimenter from previous experience. However, if this is not the 
case, the procedure of locating optimum conditions involves two 
distinct phases. The first phase involves the location of the neigh- 
borhood, while the second is to determine within the neighbor- 
hood the optimum point. 
The location of the neighborhood is accomplished by using
what is called the "method of steepest ascent." In this procedure,
one assumes that the surface can be represented locally by a sloping
plane. Starting at any point, P, the experimenter estimates the
coefficients or slopes of the plane $Y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$
by performing a suitably arranged set of trials in a small subregion
about P. From these observations, the coefficients are estimated
and one then calculates the direction of steepest ascent or
greatest slope up the plane. He then proceeds to a point, Q, in 
this direction, where new observations are made, the slopes are 
redetermined, and the process repeated. In this way, by a step-by- 
step procedure, points of higher and higher response are reached. 

This procedure cannot, however, be used actually to reach the 
maximum response point since, as one goes farther up the surface, 
the slopes become more gradual and thus more difficult to estimate.
The second-order terms also become relatively more important.
The procedure generally followed is to compare the linear
effects with the error variance and with the second-order effects,
and if the linear model appears adequate, the path of steepest
ascent is determined. At the point of diminishing returns, a new
point is located around which the process is repeated.

The experimental design used during the first phase where one 
is seeking the path of steepest ascent from a given point on the 
surface is generally of a two-level factorial type, where the origin 
for each variable is taken at the initial point and the levels used 
are equidistant from it in either direction. Thus, in a three-variable
situation, one would use a $2^3$ factorial design, and the eight
experimental points would be as shown in Table 1.

Table 1. Experimental Points for a $2^3$ Factorial Design

                 Factor Level
    Point     x_1     x_2     x_3
      1       +1      +1      +1
      2       -1      +1      +1
      3       +1      -1      +1
      4       -1      -1      +1
      5       +1      +1      -1
      6       -1      +1      -1
      7       +1      -1      -1
      8       -1      -1      -1

The estimation of the b's from this type of design is straightforward.
In fact, if $\sigma^2$ is the experimental error variance, we have
$b_i = \sum x_i y / \sum x_i^2$, and $V(b_i) = \sigma^2 / \sum x_i^2$ (the variance of $b_i$).

One thus has the essential ingredients needed to complete the 
first phase of the investigation. 
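
The first phase can be sketched for the three-variable case as follows: the eight points of the $2^3$ design are generated in the order of Table 1, the slopes are computed from $b_i = \sum x_i y / \sum x_i^2$, and the direction of steepest ascent is taken proportional to the slopes. The response values are hypothetical and noise-free so that the arithmetic is easy to follow.

```python
from itertools import product

# The eight points of the 2^3 design, with x1 changing fastest as in Table 1.
design = [(x1, x2, x3) for x3, x2, x1 in product((+1, -1), repeat=3)]

# Hypothetical responses from the plane Y = 10 + 2*x1 - 1*x2 + 0.5*x3.
y = [10 + 2 * p[0] - 1 * p[1] + 0.5 * p[2] for p in design]

def slopes(design, y):
    """b_i = sum(x_i * y) / sum(x_i^2) for each factor of a two-level design."""
    n_factors = len(design[0])
    return [sum(p[i] * yi for p, yi in zip(design, y)) / sum(p[i] ** 2 for p in design)
            for i in range(n_factors)]

b0 = sum(y) / len(y)                       # estimate of the constant term
b = slopes(design, y)

# The direction of steepest ascent is proportional to the vector of slopes.
norm = sum(v * v for v in b) ** 0.5
direction = [v / norm for v in b]

# A trial point Q, one coded unit along the path from the design center.
Q = [1.0 * d for d in direction]
print(b0, b, [round(d, 3) for d in direction])
```

New observations would then be taken around Q, the slopes re-estimated, and the step repeated.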

In considering the second phase, we assume that we have identified
a point P that is in the neighborhood of the optimum point.
The experimental designs used at this stage of the problem are
known as composite designs. There are two types of composite
designs, central and non-central. The central composite design
takes the $2^n$ factorial design and adds points with
high and low levels for each variable as well as additional points
at the center of the design.

The central composite design for n=2 is shown in Figure 1.
The $2^2$ factorial points are given as solid points while the added
points are open circles.

Figure 1. A Two-Dimensional Central Composite Design



For the purpose of estimating the parameters of the quadratic
form, the central composite design can be shown to be more efficient
than the $3^n$ factorial design. As one might expect, this means
that a saving in experimental points can be realized, since interest 
has been narrowed to estimating the optimum response point 
rather than to studying generally the nature of the mathematical 
model that explains the process under study. 
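
The saving in experimental points can be seen by generating a central composite design and counting its points. This is a sketch under assumptions: the distance `alpha` of the added high and low points from the center is not specified in the text, and a single center point is used; both are illustrative choices.

```python
from itertools import product

def central_composite(n, alpha):
    """2^n factorial points, 2n axial points at distance alpha, and one center point."""
    factorial = [tuple(float(v) for v in p) for p in product((-1, 1), repeat=n)]
    axial = []
    for i in range(n):
        for sign in (-1, 1):
            point = [0.0] * n
            point[i] = sign * alpha        # high or low level on one variable only
            axial.append(tuple(point))
    center = [tuple([0.0] * n)]
    return factorial + axial + center

points = central_composite(2, alpha=2 ** 0.5)   # alpha is an assumed value
for n in (2, 3, 4):
    print(n, len(central_composite(n, 1.0)), 3 ** n)
```

With one center point the design uses $2^n + 2n + 1$ points, which for n = 3 is 15 against 27 for the $3^n$ factorial, and for n = 4 is 25 against 81.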

The location of an optimum point usually requires a series of 
coordinated experiments, especially when one must first find the
neighborhood of the optimum. If the process being studied has 
little or no time effect, so that one can combine results that are 
obtained at different time intervals, the series of experiments can 
often be developed into an organized sequential program. The 
non-central composite designs are useful if one uses such a se- 
quential approach to his experimentation. The factorial portion 
and the central point are run first and, if the optimum is found to 
be close to the center being used in the factorial design, the addi- 
tional points required for the central composite design are then 
used. If, however, the optimum response is nearer one of the other 
points the factorial portion is augmented to form a non-central 
composite design. Of course, if it is indicated that a new location 
should be sought through the use of the path of steepest ascent, 
then the sequence is as follows. The fitting of the quadratic
surface, $Y = \sum_{i=0}^{n} \sum_{j=i}^{n} c_{ij} x_i x_j + e$, to the observations realized
from the composite design can be obtained by standard multiple 
regression techniques. Following the estimation of the coefficients, 
one can perform an analysis of variance on the results to establish 
the significance of the several coefficients as well as the signifi- 
cance of the regression itself. If one has some prior information as 
to the value of $\sigma^2$, the experimental error variance, this information can be
used in a comparison with the residual mean square associated 
with the regression analysis to provide a test of goodness of fit of 
the second-degree equation. If the fit is not satisfactory, one may 
change his neighborhood if this seems required, or increase the 
order of the regression equation. 

When such a test has indicated that an adequate fit has been 
obtained, the fact that an individual coefficient is or is not sta- 
tistically significant is of no practical significance. What this means 
is that one might just as well retain the small coefficient in his 
future analyses, since there appears to be no really good reason 
for making the hypothesis that one of the coefficients is actually 
zero in the population model. 

When the second-degree equation has been fitted, it is necessary 
to interpret it to see if one can, in fact, determine the coordinates 
of the optimum response point. Since the coefficients in a general 
quadratic do not readily convey to the observer the nature of the 
surface being represented, one usually resorts to a canonical reduction
of the equation so as to obtain the canonical form,

$$Y = b_0 + b_{11} X_1^2 + b_{22} X_2^2 + \cdots + b_{nn} X_n^2.$$

There are many types of surfaces that can be obtained through
the use of the quadratic function. Under certain conditions, includ- 
ing those where all the b’s are negative, there will be a point maxi- 
mum in all the variables. Another situation, however, that may be 
encountered is where the maximum is in fact remote from the re- 
gion of the design, but the surface is elongated along an axis 
which passes close to the design. This indicates that the previous 
experimentation has brought the experimenter not to a maximum 
but close to a rising ridge of the surface. No conclusion as to 
optimum conditions can be drawn in this latter case, but one can, 
from observation of the nature of the rising ridge, determine where 
additional experimentation should be carried out in attempting to 
locate the optimum point. In the case that the optimum point falls 
within the region of the experiment, its position can be obtained 
by differentiating the original quadratic with respect to the variables
$x_1, x_2, \ldots, x_n$ in turn and equating the results to zero. This will
yield a set of linear equations which, when solved simultaneously, 
give the coordinates of the optimum point. It should be empha- 
sized, however, that the nature of the surface should be critically 
examined through the use of the canonical transformations ap- 
proach before one seeks these coordinates. In fact, as the dimen- 
sion of the problem increases, making a careful examination be- 
comes most important. 
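The examination described above can be sketched for the two-variable case. The following Python sketch is illustrative only: the coefficient values are hypothetical and the function names are our own. It solves the two linear equations obtained by differentiation and then computes the canonical coefficients (the characteristic roots of the matrix of second-order coefficients) to see whether the stationary point is in fact a maximum.

```python
import math

def stationary_point(b1, b2, b11, b22, b12):
    """Solve dY/dx1 = 0 and dY/dx2 = 0 for the fitted quadratic
    Y = b0 + b1*x1 + b2*x2 + b11*x1**2 + b22*x2**2 + b12*x1*x2."""
    # The two derivative equations are
    #   2*b11*x1 +   b12*x2 = -b1
    #     b12*x1 + 2*b22*x2 = -b2
    # and are solved here by Cramer's rule.
    det = 4.0 * b11 * b22 - b12 * b12
    if abs(det) < 1e-12:
        raise ValueError("no unique stationary point (ridge-like surface)")
    x1 = (-2.0 * b22 * b1 + b12 * b2) / det
    x2 = (-2.0 * b11 * b2 + b12 * b1) / det
    return x1, x2

def canonical_coefficients(b11, b22, b12):
    """Characteristic roots of [[b11, b12/2], [b12/2, b22]]."""
    mean = (b11 + b22) / 2.0
    spread = math.hypot((b11 - b22) / 2.0, b12 / 2.0)
    return mean - spread, mean + spread

# Hypothetical fitted coefficients, chosen only for illustration.
B1, B2, B11, B22, B12 = 4.0, 6.0, -2.0, -3.0, 1.0

x1_s, x2_s = stationary_point(B1, B2, B11, B22, B12)
root_low, root_high = canonical_coefficients(B11, B22, B12)

# All canonical coefficients negative: the stationary point is a maximum.
is_maximum = root_high < 0.0
print(x1_s, x2_s, root_low, root_high, is_maximum)
```

If one canonical coefficient were close to zero while the other was negative, the surface would be a ridge, and the coordinates of the stationary point alone would be misleading.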

The mechanics of analyzing the data obtained from the se- 
quence of observations made in following the approach outlined 
above can be readily adapted to digital computer programs. In 
fact, many of the procedures make use of techniques for which 
standard computer programs are already generally available. Thus 
in the initial phase, where one is interested in following the path 
of steepest ascent using a linear fit to the experimental data, 
multiple regression computer programs are applicable. These programs
give not only the best estimates for the regression coefficients
but also their standard errors as well as the standard
error of estimate for the response variable. Through the use of 
transformations, the significance of the quadratic terms in the 
surface can also be readily tested using the same computer pro- 
gram. This enables one to determine when to abandon the steep- 
est ascent phase of the investigation. 

In the calculation of the actual path of steepest ascent, the 
successive differentiation of the fitted linear relationship yields 
simultaneous linear equations whose solution can be obtained
from standard programs for solving systems of simultaneous equations.
In fact, even the determination of the possible steps up the
path through the computation of coordinates of the points on the 
path can easily be programmed. 
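As an illustration of how the coordinates of points on the path might be programmed, the sketch below assumes a fitted linear relationship with hypothetical coefficients. The direction of steepest ascent is taken proportional to the fitted coefficients; the step length and the design center are likewise assumptions made for the example.

```python
import math

def steepest_ascent_path(center, coeffs, step, n_steps):
    """Return n_steps points moving away from `center` in the direction
    of the fitted linear coefficients (b1, ..., bk)."""
    norm = math.sqrt(sum(b * b for b in coeffs))
    if norm == 0.0:
        raise ValueError("all coefficients are zero; no direction of ascent")
    direction = [b / norm for b in coeffs]
    return [
        [c + k * step * d for c, d in zip(center, direction)]
        for k in range(1, n_steps + 1)
    ]

# Hypothetical fit Y = b0 + 3*x1 + 4*x2 around a design centered at (0, 0).
path = steepest_ascent_path(center=[0.0, 0.0], coeffs=[3.0, 4.0],
                            step=0.5, n_steps=3)
for point in path:
    print(point)
```

Each point would be a candidate setting for the next observation; one continues up the path until the response ceases to increase.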

When one reaches the point of fitting the quadratic surface to 
the data obtained from a composite design, the determination of 
the coefficients of the surface, their standard errors and the standard
error of estimate is also a multiple regression program application.
The quadratic terms are simply treated as new linear
variables in this case. The determination of the optimum point is 
again the solution of a set of simultaneous linear equations. 
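The device of treating the quadratic terms as new linear variables can be sketched as follows. This is a minimal illustration in plain Python, not a substitute for a regression program: the composite design points and the "true" coefficients are hypothetical, the responses are generated without error, and the helper names are our own. Standard errors are omitted.

```python
def quadratic_features(x1, x2):
    # The squared and cross-product terms enter as ordinary regressors.
    return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]

def solve_linear_system(a, r):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(r)
    m = [row[:] + [r[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(n):
            if i != col:
                f = m[i][col] / m[col][col]
                for j in range(col, n + 1):
                    m[i][j] -= f * m[col][j]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_quadratic(points, responses):
    """Least squares via the normal equations (X'X) b = X'y."""
    rows = [quadratic_features(x1, x2) for x1, x2 in points]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Hypothetical composite design: 2x2 factorial, four axial points, one center.
A = 1.414
design = [(-1, -1), (1, -1), (-1, 1), (1, 1),
          (-A, 0), (A, 0), (0, -A), (0, A), (0, 0)]
true_b = [80.0, 4.0, 6.0, -2.0, -3.0, 1.0]  # assumed, for illustration only
y = [sum(b * f for b, f in zip(true_b, quadratic_features(x1, x2)))
     for x1, x2 in design]

fitted_b = fit_quadratic(design, y)
print([round(b, 6) for b in fitted_b])
```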

VII. Classification Techniques 

The theory of statistical classification deals with the problem of 
assigning one or more individuals to one of several possible groups 
or populations on the basis of a set of characteristics observed 
among them. Thus, the problem of classification can be considered 
as a special case of application of multi-variate decision theory. The
nature of the observed characteristics may vary from problem to 
problem. In some cases they may be all of a measured type, while 
in another situation the variables may all be of the simple cate- 
gorical type of attributes in which each observation can take on 
but one of a finite number of distinct values or states. Siegel has 
noted that “measurements may, in general, be from four scales:
the nominal, ordinal, interval, and ratio scales. In any given multi- 
variate classification problem, the measurements may be of a mix- 
ture involving some or all of these types of variables.” It should 
be expected that numerous approaches have been advanced as to 
how one should go about evolving a classification decision rule. 

It should be recognized that since the area of interest has been 
designated as “statistical” classification, this means that the deci- 
sion rule must be based upon observational data available from 
samples from the several populations rather than on known popu- 
lation characteristics. Thus we assume that we have a sample of 
individuals from each population and for each of these individuals 
we have available the same set of observations as are available 
for the individual requiring classification. 

Consider for illustration a well-known classification problem, 
that of a prospective student applying for admission by submitting 
credentials such as his high school records and in addition being 
given a battery of admission tests. These data become the multi-variate
set of observations available on each applicant. The problem
is to classify, in advance, the applicant into the population to
which he belongs, where the alternatives are the population of 
those students who can successfully complete college training and 
the population of students who will not complete the college 
courses successfully. Available to the admissions office are the
same data on former students, some known to have completed and 
the remaining known not to have completed college. 

Let us now look at the steps required to evolve a classification 
rule. Statistical classification rules, in general, depend either upon 
the concept of likelihood where one considers the ratios of the 
likelihoods that the observation to be classified came from the 
suspect populations, or they depend upon the value of some classification
statistic whose form is assumed and is evaluated for the
individual requiring classification. The samples that are available 
from each population are used to estimate the likelihood ratios or 
the constants in the classification statistic, depending on which 
approach is being used. 

There are four major steps that must be accomplished if one is 
to evolve a classification rule, in brief: selection of the variables,
selection of the classification technique, selection of the decision 
rule, and an analysis of effectiveness. These we now consider. 

The Selection of the Variables to be Used in Making the Classification.

Here one encounters problems such as whether or not to include 
in his observational vector variables of different types, how re- 
liably each available variable can be measured or determined, the 
discrimination power of the variable relative to the populations of 
interest, the inter-relationship of the variables, and the cost of 
making each variable determination. The decisions of selection de- 
pend in the main on personal judgments, since at present no good 
selection rule exists. 

The Technique to be Used in Making the Classification Estimate 
and the Use of Available Sample Data to Make the Estimates. 

One can identify several estimation techniques in the literature; 
however, the best known technique for measured variables is to 
use the Wald Statistic, which is simply a linear function of the 
observations in the form 

$$W(z) = \sum_{q=1} \sum_{p=1} \sigma^{pq} z_q \left(\mu_p^{(2)} - \mu_p^{(1)}\right),$$

where $\sigma^{pq}$ = general term in the inverse of the common
covariance matrix and $\mu_q^{(i)}$ = mean of $x_q$, the $q$-th variable, in population $\pi_i$.
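To make the form of such a linear statistic concrete, the sketch below computes its coefficients for two measured variables. The population means and the common covariance matrix are hypothetical values chosen for the example, and the function names are our own; in practice these quantities would be estimated from the samples.

```python
def inverse_2x2(s):
    (a, b), (c, d) = s
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def wald_coefficients(mean1, mean2, cov):
    """Coefficient of z_q is the sum over p of
    sigma^{pq} * (mu_p^(2) - mu_p^(1))."""
    inv = inverse_2x2(cov)
    diff = [m2 - m1 for m1, m2 in zip(mean1, mean2)]
    return [sum(inv[p][q] * diff[p] for p in range(2)) for q in range(2)]

def wald_statistic(z, coeffs):
    return sum(c * zq for c, zq in zip(coeffs, z))

# Hypothetical values: a test score and a grade average in two populations.
mean_pop1 = [50.0, 2.0]
mean_pop2 = [60.0, 3.0]
common_cov = [[100.0, 5.0], [5.0, 1.0]]

coeffs = wald_coefficients(mean_pop1, mean_pop2, common_cov)
w_at_mean1 = wald_statistic(mean_pop1, coeffs)
w_at_mean2 = wald_statistic(mean_pop2, coeffs)
print(coeffs, w_at_mean1, w_at_mean2)
```

The statistic takes larger values, on the average, for observations from the second population; the difference between its two conditional means equals its variance under the common covariance assumption.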

Selection of the Decision Rule to be Used in Making the Actual
Classification Decision for a Given Observation.

To discuss this step at this stage, it seems best to restrict our 
consideration to the two-population classification problem. We 
then have available for making the classification decision either a
likelihood ratio that is a numerical function of the observational
vector z, say L(z), or we have a classification statistic defined
as a numerical function of z, say C(z). In either case a decision
rule is then simply the division of the L(z) or C(z) one-dimensional
interval into two regions such that for those z's that
yield an L(z) or C(z) that falls in region two, the individual will 
be classified into population two. Thus we have reduced the prob- 
lem of classification to that of determining the region. 

Determining the Operational Effectiveness of the Classification
Technique.

Basic to the measurement of the operational effectiveness of any 
classification technique are the probabilities:
p(i | j) = the probability of misclassifying an individual who belongs
in population $\pi_j$ into population $\pi_i$.

From these probabilities one can evolve expected cost estimates as 
well as other criteria of worth. To obtain estimates of these proba- 
bilities one requires the conditional distribution function of the 
likelihood ratios or the classification statistic used in the technique. 
In some cases these distributions can be expressed either exactly 
or approximately in mathematical form and then the misclassifi- 
cation probability estimations simply require the evaluation of an 
integral over the required region. When such a mathematical 
representation is not available, an empirical approach can be used 
involving the individual observations available in the samples to 
produce an empirical estimation of the conditional distributions. 
Here it seems best to discuss the details of this step around an 
actual problem. 

For our problem let us assume that the admission office requires 
an admission policy such that the probability of a student's doing
unsuccessful work if admitted should be less than or equal to one 
tenth. 

1) The Control of Error Approach. 

What is needed is the distribution function of the statistic W(z)
since we would like to select $\lambda$ such that P[W(z) > $\lambda$ | z belongs
to $\pi_1$] = 0.10. We know that W(z) is asymptotically normally distributed
under the condition that z belongs to $\pi_1$ with the mean,

$$\bar{W}_1 = \sum_{j=1}^{p} \sum_{i=1}^{p} \sigma^{ij} \mu_i^{(1)} \left(\mu_j^{(2)} - \mu_j^{(1)}\right),$$

and variance,

$$V_w = \sum_{j=1}^{p} \sum_{i=1}^{p} \sigma^{ij} \left(\mu_i^{(2)} - \mu_i^{(1)}\right) \left(\mu_j^{(2)} - \mu_j^{(1)}\right).$$

For the sample data, we find upon substituting the appropriate 
sample characteristics into the formula for the means and variance 
that 

$\bar{W}_1 = 7.746$ and
$V_w = 3.676$.

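Thus we have to solve for $\lambda$ in the equation

$$P(2 \mid 1) = \frac{1}{\sqrt{2\pi}} \int_{(\lambda - \bar{W}_1)/\sqrt{V_w}}^{\infty} e^{-z^2/2} \, dz = 0.10.$$

From the table of areas under the normal curve we have

$$1.282 = \frac{\lambda - 7.746}{\sqrt{3.676}}$$

and

$$\lambda = 10.20$$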

and our classification decision rule can be stated as:

“If W(z) = 0.0350 $z_1$ + 0.0448 $z_2$ + 0.12747 $z_3$ > 10.20, classify the
observation as belonging to $\pi_2$.”

(That is, admit the student to the curriculum.) 
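The table look-up above can be checked with a few lines of Python. The sketch below uses the mean 7.746 and variance 3.676 quoted in the text and relies on the normal approximation; the variable names are our own, and `NormalDist` from the Python standard library stands in for the printed table of normal areas.

```python
import math
from statistics import NormalDist

mean_w1 = 7.746      # mean of W(z) for the unsuccessful population
var_w = 3.676        # variance of W(z)
alpha = 0.10         # allowed probability of admitting an unsuccessful student

# Upper 10 per cent point of the standard normal curve (about 1.282).
z_upper = NormalDist().inv_cdf(1.0 - alpha)

# Critical value of W(z) above which the applicant is admitted.
cutoff = mean_w1 + z_upper * math.sqrt(var_w)

# Check: the tail area beyond the cutoff, given the first population.
tail = 1.0 - NormalDist(mean_w1, math.sqrt(var_w)).cdf(cutoff)
print(round(z_upper, 3), round(cutoff, 2), round(tail, 3))
```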

In a more general sense we can balance the two values of the 
two misclassification probabilities by selecting the appropriate 
value of $\lambda$ so as to meet any single constraint that might be imposed.
For example, one may wish to control the errors such that
the two probabilities are equal. It is evident that the solution of the
resulting integral equation may require a numerical technique of 
some sort. 
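One simple numerical technique for the equal-error constraint is bisection, sketched below. The sketch assumes that W(z) is normal in both populations with the means 7.746 and 11.422 and the common variance 3.676 of the admission example; the function names and the tolerance are our own choices.

```python
import math
from statistics import NormalDist

def error_gap(cutoff, mean1, mean2, sd):
    """P(2|1) - P(1|2) for the rule 'classify into population 2
    when W(z) exceeds cutoff'."""
    p21 = 1.0 - NormalDist(mean1, sd).cdf(cutoff)
    p12 = NormalDist(mean2, sd).cdf(cutoff)
    return p21 - p12

def equal_error_cutoff(mean1, mean2, sd, tol=1e-8):
    # The gap is positive at mean1, negative at mean2, and decreasing.
    lo, hi = mean1, mean2
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if error_gap(mid, mean1, mean2, sd) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

sd_w = math.sqrt(3.676)
balanced_cutoff = equal_error_cutoff(7.746, 11.422, sd_w)
print(round(balanced_cutoff, 3))
```

With equal variances the answer falls midway between the two means, which provides a check on the procedure; the same bisection applies unchanged when the two conditional distributions differ in shape.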

2) The Cost Control Approach. 

Consider in our student admission example that we have available
the cost factors:

C(2 | 1) = the cost of misclassifying an individual into population
$\pi_2$ when he really belongs to $\pi_1$ (admitting a poor student) = 10, and

C(1 | 2) = the cost of classifying an individual into population $\pi_1$
when he belongs to $\pi_2$ (failing to admit a good student) = 20.

$q_1$ = the a priori probability of a candidate for admission being
from population $\pi_1$ = 0.25.

$q_2$ = the a priori probability of a candidate for admission being
from population $\pi_2$ = 0.75.

Then if we wish an admission policy that would operate so as to 
minimize the expected loss, we have that 
$$L_\lambda = q_1 \, p(2 \mid 1, \lambda) \, c(2 \mid 1) + q_2 \, p(1 \mid 2, \lambda) \, c(1 \mid 2),$$
where $L_\lambda$ is the expected loss. In our particular case,
$$L_\lambda = (0.25)(10) \, p(2 \mid 1, \lambda) + (0.75)(20) \, p(1 \mid 2, \lambda) = 2.5 \, p(2 \mid 1, \lambda) + 15.0 \, p(1 \mid 2, \lambda).$$

So we seek a $\lambda$ which would minimize $L_\lambda$. One can simply try different
values of $\lambda$, determine the $p(2 \mid 1, \lambda)$ and $p(1 \mid 2, \lambda)$ corresponding
to the $\lambda$ and then compute the $L_\lambda$. Since the relationship
between $L_\lambda$ and $\lambda$ is quite smooth, one can through such a
trial procedure approximate the appropriate minimizing value of
$\lambda$ within three or four steps.
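The trial procedure can be mechanized as a scan over a grid of trial values. The sketch below assumes, as in the error control approach, that W(z) is normal with means 7.746 and 11.422 and common variance 3.676; the grid limits and spacing are arbitrary choices of ours, and a fine grid simply replaces the three or four hand trials.

```python
import math
from statistics import NormalDist

SD_W = math.sqrt(3.676)
DIST_1 = NormalDist(7.746, SD_W)    # W(z) given the unsuccessful population
DIST_2 = NormalDist(11.422, SD_W)   # W(z) given the successful population

def expected_loss(cutoff):
    """2.5 * p(2|1) + 15.0 * p(1|2) for the rule
    'admit when W(z) exceeds cutoff'."""
    p21 = 1.0 - DIST_1.cdf(cutoff)
    p12 = DIST_2.cdf(cutoff)
    return 0.25 * 10.0 * p21 + 0.75 * 20.0 * p12

# Trial values from 5.00 to 13.00 in steps of 0.01.
trial_values = [i / 100.0 for i in range(500, 1301)]
best_cutoff = min(trial_values, key=expected_loss)
print(best_cutoff, round(expected_loss(best_cutoff), 3))
```

Under these assumptions the minimizing value lies well below the 10.20 of the error control rule, since failing to admit a good student is here the costlier and the more probable error.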

Determining the Effectiveness of the Above Classification Rule.
In the case of the above two-population, control-of-misclassification-error
situation, we compute the probabilities:

P(2 | 1) = P(Admitting a student who subsequently does unsatisfactory
work)
= P(Classifying z into $\pi_2$ when z belongs to $\pi_1$),

and

P(1 | 2) = P(Failing to admit a student who could do successful work)
= P(Classifying z into $\pi_1$ when z belongs to $\pi_2$).

Under step 3 we determined the classification rule (i.e., the $\lambda$)
such that p(2 | 1) = 0.10. To determine p(1 | 2) we have

$$\bar{W}_2 = \sum_{j=1}^{p} \sum_{i=1}^{p} \sigma^{ij} \mu_i^{(2)} \left(\mu_j^{(2)} - \mu_j^{(1)}\right) = 11.422,$$

and, due to the equal covariance assumption,

$$V_w = 3.676,$$

so,

$$P(1 \mid 2) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{(10.20 - 11.422)/\sqrt{3.676}} e^{-z^2/2} \, dz = 0.26.$$

The rationale in these probability evaluations can best be ex- 
hibited graphically (Figure 1). 

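Figure 1. Probability Evaluations
[Figure: the conditional distributions P(W | $\pi_1$) and P(W | $\pi_2$) plotted against W(z), centered at 7.746 and 11.422, with the area P(1 | 2) marked.]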


Thus we find that the operational effectiveness of the classification
rule is such that P(2 | 1) = 0.10 and P(1 | 2) = 0.26. If one is
disturbed over the size of P(1 | 2), he can either increase the allowable
size of P(2 | 1) or he may seek additional or new variables
that better discriminate between the two populations.

Essentially, each of the classification techniques identified above 
follows the four main development steps that were enumerated in 
detail for the Wald Classification Statistic. Two additional prob- 
lems warrant special mention, however. 

The first is the so-called distribution problem. That is, the requirement
to have some knowledge as to how the statistic or likelihood
ratio being used is distributed in probability under the condition
that an individual comes from $\pi_i$. This knowledge is required
if one wants to formulate the particular classification rule
to meet an error control or cost criterion. It is also needed if one 
is to estimate measures of operational effectiveness. We used the 
information that W(z) was normally distributed to generate 
these distribution requirements in the student admission illustra- 
tive example. One may, however, be interested in using a classifi- 
cation technique for which the mathematical form of its condi- 
tional probability distribution is unknown. In that case, especially 
if one has available a high speed digital computer and the sample 
sizes are sufficiently large, one can resort to the use of an empirically
generated conditional distribution using the sample data. To
illustrate the concept, let us suppose that we have available in
the student admission problem data on 190 individuals known to
be from population $\pi_1$ (unsuccessful). Then if the value of the
statistic, $W_1(z)$, were computed for the 190 cases, these observations
could be tabulated into a cumulated frequency distribution,
the distribution plotted and a smooth distribution function drawn
freehand to approximate the ogive of the underlying conditional
probability distribution. From such graphical representation appropriate
values of P(2 | 1, $\lambda$) and P(1 | 2, $\lambda$) could be determined
for corresponding values of $\lambda$. In our error control classification
rule for the college admission problem we would have the frequency
distribution and graphical representation as shown in
Table 1 and Figure 2.


Table 1. Frequency distribution for $W_1(z)$, college entrance
problem, population $\pi_1$

Interval        Tally    Cum.    Cum. %
 4.50- 5.24        9     190      100
 5.25- 5.99       19     181       95
 6.00- 6.74       24     162       85
 6.75- 7.49       28     138       73
 7.50- 8.24       21     110       58
 8.25- 8.99       17      89       47
 9.00- 9.74       24      72       38
 9.75-10.49       12      38       20
10.50-11.24       18      24       13
11.25-11.99        2       6        3
12.00-12.74        4       4        2


Figure 2. Empirical distribution of W(z) given $\pi_1$

A comparable empirical estimate of the distribution of W(z)
under the condition that the observation belongs to population $\pi_2$
could be evolved through the use of the observations available in
the sample from $\pi_2$. The only variation in the technique would
be in the accumulation of the frequencies. In this second case one
would accumulate the frequencies with increasing W's.

Thus we would have an estimate of P(2 | 1, $\lambda$) which yields the
estimate of the probability of classifying an individual who is a
$\pi_1$ as a $\pi_2$ if one used the decision rule “If W(z) > $\lambda$ classify the
individual into $\pi_2$.”
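The empirical estimate can be read off by machine as well as from a graph. The sketch below uses the cumulated counts of Table 1 for the unsuccessful population and replaces the freehand curve by straight-line interpolation between class limits, which is an assumption of ours; the closing limit of 12.75 with no cases above it is likewise inferred from the class width.

```python
# Lower class limits from Table 1, with an assumed closing limit of 12.75.
LOWER_LIMITS = [4.50, 5.25, 6.00, 6.75, 7.50, 8.25,
                9.00, 9.75, 10.50, 11.25, 12.00, 12.75]
# Number of the 190 cases with W(z) at or above each limit ("Cum." column).
CASES_AT_OR_ABOVE = [190, 181, 162, 138, 110, 89, 72, 38, 24, 6, 4, 0]
TOTAL_CASES = 190

def empirical_p21(cutoff):
    """Estimated probability that an individual from the unsuccessful
    population has W(z) above cutoff, i.e. would be admitted."""
    if cutoff <= LOWER_LIMITS[0]:
        return 1.0
    if cutoff >= LOWER_LIMITS[-1]:
        return 0.0
    for k in range(len(LOWER_LIMITS) - 1):
        left, right = LOWER_LIMITS[k], LOWER_LIMITS[k + 1]
        if left <= cutoff < right:
            frac = (cutoff - left) / (right - left)
            count = CASES_AT_OR_ABOVE[k] + frac * (
                CASES_AT_OR_ABOVE[k + 1] - CASES_AT_OR_ABOVE[k])
            return count / TOTAL_CASES

for trial in (9.00, 9.75, 10.50):
    print(trial, round(empirical_p21(trial), 3))
```

A companion function for the successful population would accumulate its frequencies in the opposite direction, giving the estimate of P(1 | 2, $\lambda$).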

The second problem that warrants additional mention is the 
multi-population problem. Here we are interested in classification 
procedures that could classify an individual into one of the several 
populations, where the number of populations is greater than two. 

If one can associate with each population, $\pi_i$, a $q_i$, the a priori
probability of obtaining for classification an observation from
population $\pi_i$, and a cost factor, C(j | i), associated with misclassifying
an observation from $\pi_i$ as being from $\pi_j$, then a decision
rule is available that will minimize the expected cost of
making classification. The rule states that:

“If

$$\sum_{i=1,\, i \neq k} q_i \, p_i(z) \, c(k \mid i) \le \sum_{i=1,\, i \neq j} q_i \, p_i(z) \, c(j \mid i)$$

for all $j$ ($j \neq k$), then z should be classified into $\pi_k$.”

If the equality holds for some indices along with k, then it is immaterial
as to whether the individual is classified into $\pi_k$ or one
of the populations whose index yields the equality.
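The rule lends itself directly to programming. In the sketch below, three hypothetical populations are described by univariate normal densities, and the a priori probabilities and unit costs are likewise invented for illustration; index 0 stands for the first population. A tie is resolved by taking the first index, which the rule permits.

```python
from statistics import NormalDist

def classify_min_expected_cost(z, priors, densities, cost):
    """Return the index k minimizing the sum over i != k of
    q_i * p_i(z) * c(k|i); cost[k][i] is the cost of classifying
    into population k an observation from population i."""
    n = len(priors)
    weighted = [priors[i] * densities[i](z) for i in range(n)]
    expected = [
        sum(weighted[i] * cost[k][i] for i in range(n) if i != k)
        for k in range(n)
    ]
    return min(range(n), key=lambda k: expected[k])

# Hypothetical populations, priors and costs.
populations = [NormalDist(0.0, 1.0), NormalDist(5.0, 1.0), NormalDist(10.0, 1.0)]
densities = [pop.pdf for pop in populations]
priors = [0.2, 0.5, 0.3]
unit_cost = [[0, 1, 1],
             [1, 0, 1],
             [1, 1, 0]]

for z in (0.2, 5.1, 9.0):
    print(z, classify_min_expected_cost(z, priors, densities, unit_cost))
```

With unit costs the rule reduces to choosing the population for which $q_i p_i(z)$ is largest; unequal costs shift the boundaries toward the populations that are cheaper to mistake.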

It should be noted that the practical use of these classification 
techniques will usually require the use of high speed computing 
facilities. This is especially true if the dimension of the problem 
is at all large or if one must empirically generate the conditional 
distribution of the statistic being used by utilizing the individual 
observations available in the samples. There are many unresolved 
problems associated with the use of many of these techniques, but 
it is felt that the systematic exploration of their applicability in 
many practical problems cannot help but advance the general 
state of the art. Although the discriminating power of the set of 
variables currently being accumulated can be determined, the 
characteristics of the underlying distributions and the relative
effectiveness of the competitive procedures must in many respects
be tackled pragmatically. Attention must be given to the problem
of estimating both the underlying a priori probabilities associated
with the populations being considered along with the misclassification
cost factors. Individuals may feel that such refinements are
inappropriate to their particular classification problem, but it can
be argued that until one addresses himself to the problem in some
such systematic and scientific way, no real improvement can be
expected. The criterion of worth of any system is its operational
effectiveness, and thus one should not only feel challenged to
obtain estimates of the operational effectiveness of the “system”
he is now using, but he should also investigate how the effectiveness
may be improved by using one of the above statistical classification
techniques.

Introductory Note

The good Christian should beware of mathematicians ... 

St. Augustine 

O brave new world, that has such people in't!

Shakespeare 

Andrew Hacker has suggested that the use of electromagnetic
computers to simulate the political behavior of the real world has
led to essentially trivial findings. A major fault, he notes, is that
these enterprises are committee operations. “Computers . . . have
no judgment. The sad thing is that those who are running the
machines are themselves reluctant to exercise that quality which
the computer lacks. One reason is that most such projects are
team operations.”¹ Hacker's example of this kind of futile activity
is the attempt to simulate voter reaction in the 1960 presidential 
primary in Wisconsin. 

By coincidence the 1960 Wisconsin primary model is the subject 
of the first part of the article by Frank Scalora in this volume. This 
model has been criticized also on the ground that the procedure 
permits a sinister manipulation of voter information, and leads to
thought control and to “brainwashing.” This view has been expressed
by some prominent American political leaders, and it is
consistent with the point of view of those traditionalist political 
philosophers who emphasize ethical and normative approaches. 
Evaluation of the Wisconsin model based on this frame of refer- 
ence is obviously in sharp contrast with that of Hacker, for it 
suggests that the model is important and deserves careful and 
critical attention. 

The Scalora paper deals with problems in politics and adver- 
tising. The model is designed to aid a candidate by furnishing 
insights which can be used to persuade voters to support him, 
and it is designed to aid a firm in persuading university gradu- 
ates to go to work for it. The important question is whether the 
model achieves an efficient solution to these problems. If its value
is negligible, it will not be because it is the product of a committee
and utilizes a computer.

¹ “Mathematics and Political Science,” in James C. Charlesworth (ed.), Mathematics
and the Social Sciences (Philadelphia: American Academy of Political and Social
Sciences, 1963), pp. 65-69.

The Scalora model offers enhanced efficiency to those who want
to manipulate human behavior. Its successful employment is a 
legitimate subject of concern in normative terms, as well as proof 
that Hacker was in error to dismiss it lightly. The history of sci- 
ence is full of evidence that solutions to old problems often 
create new problems. This is an inevitable consequence of change. 
The progress of science and social change point toward com- 
plexity, not toward doomsday or the heavenly city. 

The article by Gerald Kramer is, like that of Scalora, designed 
to solve a problem of so-called practical politics. The aim of the 
model is the optimum allocation of scarce resources in a pre- 
election canvass. Output is measured in terms of votes won, or of 
votes gained. The political problem is, therefore, closely analogous 
to the resource allocation problems common to economics. Yet the 
fact that it is political is a source of specific difficulties. Votes 
won, or votes gained, are a difficult commodity to measure. The 
author, qualified in both mathematics and political science, has 
produced an imaginative model. If he solves the canvassing problem,
other resource allocation activities, such as television advertising,
may become amenable to model solutions that are not presently
available.

These models may be of little interest to the student of systems 
analysis or grand theory. Scalora and Kramer are concerned with 
relatively modest problems compared with war and peace, sur- 
vival in the nuclear age, the pursuit of justice, or the proper con- 
struction of nation-states. In a sense the aims of these papers are 
prediction and control, rather than understanding and extension 
of basic knowledge, and consequently the models may be less 
useful to political scholars than to the practitioners of the art of 
winning elections. On the other hand, as Havelock Ellis has ob- 
served: “In philosophy, it is not the attainment of the goal that 
matters, it is the things that are met with by the way.” If one 
appraises the work of political scientists in the light of the widely
differing perceptions of what ought to be done,² it is evident that
there is work aplenty for all. Some may devote themselves to great
tasks, while others perform tasks which are immediately useful.

² See, for example, Charles S. Hyneman, The Study of Politics: The Present State
of American Political Science (Urbana: University of Illinois Press, 1959) and Albert
Somit and Joseph Tanenhaus, American Political Science: A Profile of a Discipline (New
York: Atherton Press, 1964).



Stochastic Models in the Behavioral
Sciences: Applications to Elections
and Advertising

FRANK S. SCALORA

IBM-World Trade Corporation

The use of mathematics in the behavioral sciences has benefited
from the availability of modern computing machines. It is
now possible to take a behavioral situation, put it into a mathe- 
matical framework which simulates it, and then try it out on the 
machine. If good data are available the validity of the mathemati- 
cal model can be tested and then simulation runs made, increasing 
our understanding of the behavioral situation. It would be un- 
realistic to attempt to do this by hand because of the difficulties 
both of handling of data and of time availability. 

We shall discuss in this paper a common type of behavioral 
situation and the mathematical models developed to describe two 
applications. The situation is basically that of an election, although
it can be interpreted in many ways. 

In a political campaign, each candidate must decide which
issues to stress to help him win over his opponent. The advertising
campaign for a product consists of telling the consumer about its
especially good qualities to the detriment of competing products.
The recruiting manager of a company has to find out what are 
the most effective aspects to stress in a campaign to get the best 
people for the job. 

All of these situations involve a population and competitors who 
are trying to influence the population by communicating with 
them. We will now describe the way these situations can be put 
into a useful mathematical framework. The problem would have 
remained intractable but for the availability of computing ma- 
chines to store and process huge quantities of data at high speeds 
and the uniting of sociological and mathematical ideas. 

The election model discussed here was developed by a research
group at IBM, of which the author was a member together with
Dr. William N. McPhee. Dr. McPhee has reported on the results
obtained when the model was used in the 1960 Wisconsin Primary
in McPhee, Formal Theories of Mass Behavior (London:
Free Press of Glencoe, 1963). The college recruitment model was
developed by the author of SBC with the consulting help of Dr.
McPhee. Earlier reports on the work have appeared in the Proceedings,
Eighth Annual Convention, New York, Advertising Research
Foundation, Fall, 1962 and Robert D. Buzzell, Mathematical
Models and Marketing Management (Boston: Division of
Research, Harvard Business School, 1964).

In simulating a situation of the kind discussed here, we must 
isolate the main forces which are at work and describe them 
mathematically. Thus, we have a population which is being asked 
to make a choice among various competitors as the competitors 
address messages or communications to them. We think of the 
individual member of the population as a person with precon- 
ceived ideas about the competitors which are, however, under- 
going changes because of external forces generated by the cam- 
paigns of the competitors and environmental forces caused by the 
people with whom he associates. The model measures the change 
in a person's initial evaluations of the competitors in the course
of the campaign. We will later discuss the results obtained when
the model was applied to the simulation of a recruiting campaign.

The population is represented in the computer by a replica, a
representative sample of it. On the basis of answers to a question- 
naire, large quantities of information about the sample are stored 
in the machine. To keep the setting completely general, we will 
continue to use the terms population and competitors. The reader 
can easily translate these terms to electorate and candidates, in 
the case of an election campaign; consumers and brands of products,
in the case of a consumer product advertising campaign;
students and companies, in the case of company recruiting campaigns
involving college students. The reader should have little
difficulty in thinking of other situations close to his own experience
which are describable in terms of the model.

The model formalizes the above observations. It consists of 
three processes or phases which abstract the preceding remarks. 
We begin with a representative sample of the population. The 
sample is stratified into homogeneous sub-samples dependent upon 
the application in question. For example, an election grouping 
might be a socio-economic or religious-ethnic one. A consumer 
grouping might be socio-economic or some other grouping which
delineates among buying patterns. A college recruiting grouping
might be on lines of major field and career interest, etc.

It is supposed that at the beginning each person in the sample
has given an overall rating of each of the competitors, say on a
scale of one-to-ten, although the scale may change from application
to application. These ratings may be obtained directly from
responses to a questionnaire as was done in the recruiting study
previously mentioned, or indirectly through a Lazarsfeld latent
structure analysis as was done in the election model, or possibly
in other ways. Each competitor will then try to increase the person's
evaluation of itself relative to its competitors. In real life,
the competitors will try to accomplish this by advertising or stressing
themes which they feel are particularly favorable to them.
The model simulates this by what we call a stimulation process.
In the stimulation process, we pick a theme or issue for each
competitor to stress. With Thurstone, we interpret a theme or
stimulus as a probability distribution. Thus, a candidate's statement
on civil rights will affect Negroes somewhat differently from
White Liberals and much differently from White Southern Conservatives.
A company's stressing of its pre-eminence in the computing
business will cause different reactions among students interested
in computers and students interested in pure mathematical
research. Thus, the effect of a stimulus will vary depending on
the group to which the person belongs. In fact, for each group we
have computed probability tables which relate overall ratings of
the competitors with ratings on the given theme or issue. Then
the response of a person to a stimulus is obtained through a random
process depending on the probability distribution peculiar to the
person's group. We illustrate this by exhibiting a simplified “stimulus”
table, which came up in our recruiting study. This table represents
the theme (Issue 8), “Encouragement of Ingenuity,” for
Company A for a group of engineering students interested in
physics.


Response to Issue 8 for Company A

Overall rating   Very High   High   Moderate   Low   Very Low
Very High           .55       .45
High                .33       .59      .08
Moderate            .20       .60      .20      -
Low                  -        .58      .42





We obtained these numbers by the use of a mathematical 
formula on the basis of answers to specific questions in the ques- 
tionnaire. There are, however, other ways of doing this. The model 
interprets this table as follows: if a person in the group has given 
Company A a very high overall rating, then we can expect him to 
give a very high response to Company A's use of Issue 8 (Encouragement
of Ingenuity) 55 per cent of the time and a high
response 45 per cent of the time. If he gives Company A a high 
overall rating, then he can be expected to respond very highly to 
the use of this issue 33 per cent of the time, highly 59 per cent 
of the time, and moderately 8 per cent of the time, etc. Thus, we 
allow the person to make a temporary response to the themes com- 
municated by the competitors by the use of these tables through 
a Monte Carlo technique. 
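The Monte Carlo step can be sketched as a random draw from the row of the stimulus table that corresponds to the person's overall rating. The sketch below encodes only the two rows spelled out in the preceding paragraph; the names are our own, blank cells are treated as zero probability (an assumption), and `random.choices` from the Python standard library performs the draw.

```python
import random

RESPONSES = ["very high", "high", "moderate", "low", "very low"]

# Rows of the Issue 8 table for Company A, keyed by overall rating.
STIMULUS_TABLE = {
    "very high": [0.55, 0.45, 0.00, 0.00, 0.00],
    "high":      [0.33, 0.59, 0.08, 0.00, 0.00],
}

def draw_response(overall_rating, rng):
    """Draw one temporary response for a person with the given
    overall rating of the competitor."""
    weights = STIMULUS_TABLE[overall_rating]
    return rng.choices(RESPONSES, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so that a run can be repeated
draws = [draw_response("very high", rng) for _ in range(10000)]
share_very_high = draws.count("very high") / len(draws)
print(round(share_very_high, 3))
```

Over many draws the share of very high responses settles near 55 per cent, as the table requires, while any single simulated person receives just one response.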

At this point, we assume that a person will want to check his 
responses. Here we borrow an idea due to K. Lewin. In real life, 
a person can check certain statements directly. For example, he 
can check the fact that a glass can be broken with a hammer by 
actually hitting a piece of glass with a hammer. In the situation 
which we are discussing, however, he will not be able to check so 
easily. For example, he cannot check objectively the statement that 
one company is better to work for than another without actually 
working for both, or that Governor Romney will make a good 
President, etc. The next best approach is for him to compare his
feelings with those of a friend. The model handles this by what
we call “the discussion process.” We pick a friend for the person
from his group. If the friend confirms his impressions, then we
assume that the person's temporary responses to the stimuli to
which he has been exposed are now permanent and are ready to
become part of his experience. If the friend does not confirm his
impressions, he will not necessarily accept his friend's impressions
over his own, but will want to rethink the situation. The model
approximates this by exposing him to the same stimuli as before.
We now accept his emerging responses as permanent.

Finally, we assume that the person will incorporate his surviving 
responses into his experience. In real life situations, the person is 
making new evaluations based on past preferences and new 
stimuli in a subjective fashion. The model accomplishes this by
what we call a “learning process.” The learning process consists
of a formula which computes new overall evaluations of the competitors
for the person involved. The formula “averages” the person's
initial overall evaluations with the surviving responses
emerging from the stimulation and discussion processes, subject
to the quantity of knowledge that the person thinks he has about
the competitors. We have previously computed these knowledgeability
numbers for each person. One thing implicit in the formula
is that a person who thinks he knows a lot about the competi- 
tors will be harder to change than one who thinks he knows less. 

[Flow chart of one cycle of the model: Present Evaluation → Stimulation Process → Temporary Response → Discussion Process → Confirmation, or Disagreement followed by a second Stimulation Process → Final Response → Learning Process → New Evaluation]

We have actually described one stage, or cycle, of the model.
We then store information about how the members of the sample
respond to the themes or issues which are likely to be used by the
competitors in their campaigns. At the end of a cycle, the person
may be exposed to new stimuli by the competitors. Now, however,
he will go into the stimulation process with new overall
evaluations. A glance at the simplified stimulus table shown above
illustrates a point which we now make.

Subsequent communications by the competitors build on the
changes in a person's attitude. Specifically, the learning process
causes slow changes in the overall ratings in the direction
of the competitor's rating on the specific issue being
stressed. This improvement, or deterioration, is a double process
because as his overall rating improves, say from high to very high,
he becomes more likely to give a very high response to the stimulus
than he was before. That, in turn, will make him improve his
overall rating still further. Alternatively, as his overall rating decreases,
say from very high to high, his distribution of probabilities
is worse. He will be less likely to give a “very high” response
than before, and may give some “moderate” responses also. This
will make his overall rating deteriorate still further.

We illustrate the above remarks by the flow chart of the model 
(see p. 115). It describes one cycle of the model. 

The Election Model 

We now give the essential details of the election model. We will 
not discuss any of its implications since these have been described 
in W. N. McPhee, Formal Theories of Mass Behavior (London:
The Free Press of Glencoe, 1963). The references here are to the
Wisconsin presidential primary in 1960. 

A. Input 

1. The Voters 

A sample of 1,783 Wisconsin voters was considered. The voters 
can be grouped in several ways. We give a possible grouping
along religious-ethnic lines.

Group Number   Description                       Number of Voters

     1         German Catholic Urban                    176
     2         German Catholic Rural                    124
     3         German Lutheran Urban                    169
     4         German Lutheran Rural                    134
     5         German Unaffiliated Urban                 76
     6         German Unaffiliated Rural                 65
     7         Great Britain Protestant                 122
     8         Great Britain Unaffiliated                61
     9         Scandinavian Urban                        77
    10         Scandinavian Rural                       106
    11         Polish Urban                              82
    12         Polish Rural                              27
    13         Irish Catholic                            58
    14         Other Eastern European Urban              78
    15         Other Eastern European Rural              54
    16         Other Western European Urban              42
    17         Other Western European Rural              34
    18         Other                                    298

                                                      1,783


In addition the voters are broken down by congressional dis- 
tricts, of which there are ten in all. 

Each voter is given a complex of numbers at the beginning:
INT, Ph, Pk, Pr, SH, SK, SR, SN, C, Ct-1, G, FRD. Definitions of
these numbers follow:

INT  : A number designating the strength of the voter's interest
       in his candidate of preference at the time in question
       (empty at beginning).

Ph   : The voter's partisanship number (overall rating) for
       the candidacy of Senator Humphrey.

Pk   : The voter's partisanship number (overall rating) for
       the candidacy of Senator Kennedy.

Pr   : The voter's partisanship number (overall rating) for
       the candidacy of Vice President Nixon.

These three numbers were obtained originally through a process
called “latent structure analysis.” They are non-negative and are
bounded by 05 and 95 except at the beginning.

SH   : A non-negative number reflecting the voter's cumulative
       involvement with the candidacy of Senator Humphrey.

SK   : A non-negative number reflecting the voter's cumulative
       involvement with the candidacy of Senator Kennedy.

SR   : A non-negative number reflecting the voter's cumulative
       involvement with the candidacy of Vice President Nixon.

SN   : A non-negative number reflecting the voter's cumulative
       involvement with no party.

These four numbers are obtained from the partisanship numbers
and a number Mv for each voter which is determined according
to the answers to certain poll questions.

C    : The voter's choice of candidate at the time in question
       (empty at beginning).
       C = 1. Choice for Humphrey
           2. Choice for Kennedy
           3. Choice for Nixon
           4. Non-Voting Choice

Ct-1 : The voter's previous choice.

G    : The voter's group number, e.g., 1 ≤ G ≤ 18 for the
       grouping which we have listed.

FRD  : The address in storage of another voter who is referred
       to as the voter's friend, with whom he presumably
       discusses the election.

2. The Stimulus Tables 

The voters will be subjected to certain stimuli which are pri- 
marily the key election issues. Given a candidate and a group of 
voters, a stimulus table is developed for each stimulus. The table 
ranks the stimulus according to intensity: 20 (very weak), 40 
(weak), 60 (neutral), 80 (strong), 100 (very strong), and proba- 
bility intervals going from 00 to 99. 

The stimulus table is merely a convenient way of assigning one 
of these intensity (stimulus) numbers to a voter with a certain 
probability. The probability that a particular number is chosen 
is determined by the length of its corresponding probability in- 
terval. To illustrate, consider the following table: 


Intensity     Corresponding Probability Interval

    20                    00-09
    40                    10-24
    60                    25-44
    80                    45-69
   100                    70-99


According to what was just said, a voter with the above
table will be assigned intensity number 20 with probability .10
(09 - 00 + 01); 40 with .15 (24 - 10 + 01); 60 with .20 (44 -
25 + 01); 80 with .25 (69 - 45 + 01); and 100 with .30 (99 -
70 + 01).

Two sets of stimulus tables were constructed, three-person tables
and two-person tables. The use of the three-person stimulus tables
presupposes that each candidate wages a full campaign. Since this
was not the case in Wisconsin, where Nixon did not campaign, we
were led to construct two-person stimulus tables for Humphrey and
Kennedy and special tables for Nixon. The special tables for
Nixon reflect one of three levels of campaigning: Normal Campaigning,
Moderate Campaigning, and No Campaigning.
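
The interval rule for the illustrative stimulus table in this subsection can be checked mechanically. The sketch below is an illustration only; `INTERVALS`, `intensity_for`, and `interval_probability` are hypothetical names, and the bounds are those of the example table.

```python
# (intensity, low, high) with inclusive two-digit bounds, as in the example table.
INTERVALS = [(20, 0, 9), (40, 10, 24), (60, 25, 44), (80, 45, 69), (100, 70, 99)]

def intensity_for(rn):
    """Map a two-digit random number (00-99) to the intensity whose interval holds it."""
    for intensity, low, high in INTERVALS:
        if low <= rn <= high:
            return intensity
    raise ValueError("random number must lie between 0 and 99")

def interval_probability(low, high):
    """Probability of an interval: the count of integers it contains, out of 100."""
    return (high - low + 1) / 100

probabilities = {i: interval_probability(lo, hi) for i, lo, hi in INTERVALS}
```

The five probabilities come out as .10, .15, .20, .25 and .30, and they sum to one because the intervals cover 00 through 99 without overlap.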

B. Working of the Model

The input is now subjected to three processes.

1. The Stimulation Process

Given a stimulus for each candidate, we subject a given voter
to it in the following way: Three random numbers RNh, RNk,
RNr (between 00 and 99) are chosen which determine the intervals
in the appropriate stimulus tables, and thus uniquely determine
stimulus numbers STIMh, STIMk, STIMr (= 20, 40, 60, 80,
or 100).

The voter has already given to him partisanship numbers Ph,
Pk, Pr. We compute:

    INTh = Ph + STIMh
    INTk = Pk + STIMk
    INTr = Pr + STIMr

If the two largest INTs are equal then we call the result a tie
and go through the same process again until there are no ties. At
this point, the largest INT is called INTmax. There is a number
called INTmin = 100 with which INTmax is compared. If INTmax <
INTmin = 100 then C = 4, i.e., the voter is a non-voter (the machine
puts 4 into the C position and INTmax into the INT position).
Otherwise, C = 1, 2, or 3 according as INTh, INTk, or INTr is
INTmax, and INTmax becomes the new INT.
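
A minimal sketch of the stimulation process, under stated assumptions: the function name `stimulate`, the candidate keys "H", "K", "R", and the table layout (lists of intensity and inclusive interval bounds) are all hypothetical, chosen only to mirror the description above.

```python
import random

INT_MIN = 100  # threshold below which the voter is a non-voter

def stimulate(partisanship, tables, rng):
    """Return (choice, interest): choice is 1, 2, 3, or 4 for non-voting.

    partisanship: dict candidate -> P; tables: dict candidate -> list of
    (intensity, low, high) with inclusive bounds between 0 and 99.
    """
    order = ["H", "K", "R"]
    while True:
        interest = {}
        for cand in order:
            rn = rng.randint(0, 99)  # two-digit random number
            stim = next(i for i, lo, hi in tables[cand] if lo <= rn <= hi)
            interest[cand] = partisanship[cand] + stim
        ranked = sorted(interest.values(), reverse=True)
        if ranked[0] != ranked[1]:
            break  # the two largest differ, so there is no tie
    int_max = ranked[0]
    if int_max < INT_MIN:
        return 4, int_max
    winner = max(order, key=lambda c: interest[c])
    return order.index(winner) + 1, int_max
```

Note that the loop repeats only when the two largest interest numbers coincide; a tie for second place does not trigger another draw. If the tables make a tie for first place certain, the loop would never end, so a practical program would need a guard that the description does not mention.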

2. The Discussion Process

Each voter has assigned to him a friend randomly. A given voter
V comes into the discussion process with interest and choice numbers
INTv, Cv from the stimulation stage. Similarly, the friend has
numbers INTf, Cf. If INTv + INTf ≥ MINM = 200 (minimum
joint interest) then we say that discussion takes place. Otherwise,
no discussion takes place, and both voter and friend are unchanged
and are returned to storage, the voter voting according to INTv
and Cv. When discussion takes place we compare Cv and Cf. If:

(1) Cv = Cf, then the voter and his friend are unchanged.

(2) Cv = 1 or 2 and Cf = 3 or vice versa, then again the voter
    and his friend are unchanged.

(3) Cv = 1, 2, or 3 and Cf = 4 or vice versa, then the non-voter is
    sent back through the stimulation process.

(4) Cv = 1 and Cf = 2 or vice versa, then both voter and friend
    are sent back through the stimulation process, but only
    through Humphrey and Kennedy stimulus tables.

When every voter has been subjected to the discussion process
he has final choice and interest numbers, and final Cs are counted
up to give the vote. It is possible to take the discussion process out
of the model and proceed to the next stage, the learning process.
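
The four discussion rules amount to a small decision table. The sketch below is one possible encoding, assuming the joint-interest test is "greater than or equal to"; the function name and the returned labels are hypothetical.

```python
MINM = 200  # minimum joint interest for a discussion to take place

def discussion_outcome(int_v, c_v, int_f, c_f):
    """Return (who is restimulated, which stimulus tables) for one voter/friend pair."""
    if int_v + int_f < MINM:
        return [], None                      # no discussion takes place
    if c_v == c_f:
        return [], None                      # rule (1): agreement
    if 4 in (c_v, c_f):
        who = "voter" if c_v == 4 else "friend"
        return [who], "all"                  # rule (3): the non-voter is restimulated
    if 3 in (c_v, c_f):
        return [], None                      # rule (2): Humphrey or Kennedy versus Nixon
    return ["voter", "friend"], "humphrey-kennedy"   # rule (4)

outcome = discussion_outcome(120, 1, 110, 2)
```

Here `outcome` sends both parties back through the Humphrey and Kennedy tables, since one chose Humphrey, the other Kennedy, and their joint interest of 230 exceeds MINM.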


3. The Learning Process

Every voter enters this stage with the numbers INT, C, SH, SK,
SR, SN, the first two having come from the preceding stage.
If C = 4, 100 - INT is added to SN to get the new SN and the
other S's remain unchanged.

If C = 1, 2, or 3 then INT - 100 is added respectively to SH, SK,
or SR to get the new SH, SK, or SR and the other S's remain unchanged.
We then compute new partisanship numbers as follows:

    Ph = SH / (SH + SK + SR + SN)
    Pk = SK / (SH + SK + SR + SN)
    Pr = SR / (SH + SK + SR + SN)

These numbers are then rounded off so that they stay in the
range between 05 and 95. Thus if Ph > 95 it is changed to 95 and
if 0 < Ph < 05 it is changed to 05. These three numbers then become
the new partisanship numbers for the voter and he is ready
to be subjected to a new set of stimuli.
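
The learning step can be sketched as follows. One point is an assumption of this sketch and not stated in the text: the ratios are multiplied by 100 so that they can be compared with the bounds 05 and 95. The function name `learn` and the keys "H", "K", "R", "N" are hypothetical, and the total of the S's is assumed to be positive.

```python
def learn(interest, choice, s):
    """Update cumulative involvement and return (new_s, new_partisanship).

    s: dict with keys 'H', 'K', 'R', 'N'; choice: 1, 2, 3, or 4 (non-voting).
    """
    s = dict(s)  # leave the caller's numbers untouched
    if choice == 4:
        s["N"] += 100 - interest
    else:
        s["HKR"[choice - 1]] += interest - 100
    total = s["H"] + s["K"] + s["R"] + s["N"]
    partisanship = {}
    for cand in "HKR":
        p = 100 * s[cand] / total  # ASSUMPTION: ratio expressed on a 0-100 scale
        if p > 95:
            p = 95
        elif 0 < p < 5:
            p = 5
        partisanship[cand] = p
    return s, partisanship

new_s, new_p = learn(130, 2, {"H": 40, "K": 30, "R": 20, "N": 10})
```

In this example the Kennedy involvement grows from 30 to 60, so his share of the total of 130 rises while the other two shares fall.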


C. Output 

The basic output consists of the following:

1. The names of the candidates followed by the stimuli that 
they are using to stimulate the electorate at the point in question. 




2. Election Intentions 

This is a tabulation of the voting after the stimulation process. 
It includes the names of the candidates together with their vote 
totals and the no-vote totals, the corresponding percentages, and 
the percentages based on the actual number of people who have 
voted. This is then followed by the voting subtotals by groups.
Example: 

Election Intentions

              Total     Percent     Percent Voting

Humphrey       354         20             31
Kennedy        412         23             36
Nixon          392         22             34
No Vote        625         35

Total: 1,783
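
The two percentage columns of the example can be reproduced from the vote totals alone. This short check is an illustration; the variable names are hypothetical.

```python
totals = {"Humphrey": 354, "Kennedy": 412, "Nixon": 392, "No Vote": 625}

sample = sum(totals.values())          # all 1,783 voters in the sample
voting = sample - totals["No Vote"]    # those who actually voted

# Percent of the whole sample, and percent of those voting.
percent = {k: round(100 * v / sample) for k, v in totals.items()}
percent_voting = {
    k: round(100 * v / voting) for k, v in totals.items() if k != "No Vote"
}
```

The second column divides by the 1,158 people who voted, which is why its entries are larger than those of the first.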


Subtotals by Groups

             1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
Humphrey    13  10  36  35  24  18  35  15  15  29  12   2   2  12  10  11   6  69
Kennedy     69  58  20  20   9  15   6   9  18   9  32  11  36  28  13   7   5  47
Nixon       37  13  42  27  22   9  43  17  17  29   5   3   6  15   4   5   4  94
No Vote     57  43  71  52  21  23  38  20  27  39  33  11  14  23  27  19  19  88

3. Election Results

This is a tabulation of the voting after the discussion process. Of 
course, if the discussion process is removed, then this appears after 
the stimulation process and replaces the Election Intentions. 

4. Election Results by Congressional Districts

This is a tabulation of the election results percentages by congressional
districts. An example follows:

Congressional     Percent      Percent     Percent
Districts         Humphrey     Kennedy     Nixon

     1               31           33          36
     2               29           37          34
     3               33           32          35
     4               28           39          33
     5               29           34          37
     6               29           37          34
     7               32           35          33
     8               29           42          29
     9               35           29          36
    10               33           35          32


5. Election Results by Groups

This is a tabulation of the election vote percentages by groups.
Example:

                                        Percent      Percent     Percent
Group                                   Humphrey     Kennedy     Nixon

 1. German Catholic Urban                  11           58          31
 2. German Catholic Rural                  12           72          16
 3. German Lutheran Urban                  37           20          43
 4. German Lutheran Rural                  43           24          33
 5. German Unaffiliated Urban              44           16          40
 6. German Unaffiliated Rural              43           36          21
 7. Great Britain Protestant               42            7          51
 8. Great Britain Unaffiliated             37           22          41
 9. Scandinavian Urban                     30           36          34
10. Scandinavian Rural                     43           13          44
11. Polish Urban                           24           65          11
12. Polish Rural                           13           69          18
13. Irish Catholic                          5           82          13
14. Other Eastern European Urban           22           51          27
15. Other Eastern European Rural           37           48          15
16. Other Western European Urban           48           30          22
17. Other Western European Rural           40           33          27
18. Other                                  33           22          45

The College Recruiting Model

We now discuss the College Recruiting Model, some details
of which have appeared in R. D. Buzzell, Mathematical Models
and Marketing Management (Boston: Division of Research, Harvard
Business School, 1964).

The problem we considered was one in which four companies
were seeking the services of honor students majoring in engineering,
mathematics, and physics. An earlier study had shown
that in evaluating a company as a place to work, such students
considered twelve issues to be particularly critical. These issues
are:

1. The company's standing in your major field of career interest.

2. The caliber of its personnel.

3. The opportunities it provides to do challenging work.

4. The opportunities it provides for rapid advancement.

5. The quality of its products or services.

6. How hard the company drives to achieve its goals.

7. Its special training program . . . formal courses offered, etc.

8. The encouragement it gives individuals to use their own ingenuity
   in tackling problems.

9. The amount of basic research the company undertakes.

10. The extent to which the company is considerate of employees
    while striving for maximum profits.

11. Starting salary.

12. The amount of financial aid and other assistance it gives to
    help employees obtain advanced degrees.

These twelve issues then became the stimuli to be used in the
stimulation process in the model, the themes which the companies
would be expected to use in their advertising.

A sample of honor students majoring in engineering, mathematics,
and physics from five universities was obtained and polled
in January and again in May, 1961, at the close of the school year.*
In addition, separate samples drawn from the same population
were quizzed at intervals within this period to determine which
communications were getting across. The samples were then
divided into mutually exclusive sub-samples according to career
interest.

* The selection and polling of the samples was done by Benton & Bowles.

A sample of about 250 students was considered. After rejection
of some inadequate questionnaires, the students were grouped in
the following way:

Group                                                                No. of
Number   Description                                                 Students

  1      Engineering students interested in computers                   39
  2      Engineering students interested in physics                     49
  3      Engineering students interested in systems                     49
  4      Engineering students interested in general engineering         23
  5      Mathematics, physics students interested in computers          23
  6      Mathematics, physics students interested in mathematics        22
  7      Mathematics, physics students interested in physics            29


The students were asked to give an overall evaluation of each 
of the four companies as a place to work, and also an evaluation 
on each of the twelve issues. A scale of one to ten was used for 
each rating. On the basis of this information, we were able to 
compute the stimulus tables described below. 

Each student is represented by a vector of numbers. Here the 
subscript indicates the time stage of the game, and the superscript 
the company involved. 

In, In^^^ I<j^4; l<j<4; l^i<4; l^j^4; C; G; F. 

At the beginning, n=0 and L, are empty. 

$P_0^{(j)}$ : The initial partisanship number (overall rating) for
the jth company; it is the student's own rating of the jth
company as a place to work. It is an integer between
1 and 10, and was obtained from a questionnaire.

$K^{(j)}$ : An integer between 1 and 4 indicating how fixed the
student's attitude toward the jth company is. A student with a high
$K^{(j)}$ will be more difficult to influence
than one with a low $K^{(j)}$.
: Given by the formula Po''\ 

C : The choice number. An integer from 1 to 4 indicating,
after each cycle or stage, the company the student
would be most likely to choose if he were forced to
make a choice then.

G : The group number. An integer from 1 to 7 indicating
to which group the student belongs.

F : The address of a friend in storage.

In addition, each student is labeled as to whether a given
stimulus is not so important, somewhat important, or very important
to him.

Given a company and a group of students, a stimulus table was 
developed for each stimulus. The stimulus table takes the follow- 
ing form: 


Rows: overall rating of the company (i). Columns: rating of the company on a given stimulus (j).

\[
\begin{array}{c|ccccc}
i \backslash j & 10 & 9 & 8 & \cdots & 1 \\ \hline
10 & p_{10,10} & p_{10,9} & p_{10,8} & \cdots & p_{10,1} \\
9 & p_{9,10} & p_{9,9} & p_{9,8} & \cdots & p_{9,1} \\
8 & p_{8,10} & p_{8,9} & p_{8,8} & \cdots & p_{8,1} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & p_{1,10} & p_{1,9} & p_{1,8} & \cdots & p_{1,1}
\end{array}
\]


In the table, $p_{ij}$ is the fraction of those students in the given
group who have rated the given company “i” overall as a place to
work who have also rated it “j” on the given stimulus. Thus,

\[ \sum_{j=1}^{10} p_{ij} = 1 \quad \text{for every } i. \]

We interpret $p_{ij}$ as the probability that a student in the given
group will rate the given company “j” on the given stimulus if he
has rated it “i” overall as a place to work. It is thus a conditional
probability. Hence, if a student in the group has rated the given
company “8” as a place to work, then he will rate the company
“10” on the given stimulus with probability $p_{8,10}$, “9” with probability
$p_{8,9}$, “8” with probability $p_{8,8}$, “7” with probability $p_{8,7}$, etc.
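
A table of such conditional probabilities can be built by normalizing counts within each overall rating. The sketch below assumes the questionnaire answers are available as (overall rating, stimulus rating) pairs; the function name `stimulus_table` and the sample pairs are hypothetical, not data from the study.

```python
from collections import Counter

def stimulus_table(pairs):
    """Return {i: {j: p_ij}} from (overall_rating, stimulus_rating) pairs.

    Each row i is normalized by the number of students who gave overall
    rating i, so every row of the result sums to 1.
    """
    pairs = list(pairs)
    counts = Counter(pairs)
    row_totals = Counter(i for i, _ in pairs)
    table = {}
    for (i, j), n in counts.items():
        table.setdefault(i, {})[j] = n / row_totals[i]
    return table

# Hypothetical answers from six students in one group, for one company and stimulus.
sample_pairs = [(8, 10), (8, 9), (8, 9), (8, 8), (9, 10), (9, 9)]
table = stimulus_table(sample_pairs)
```

For the four students who rated the company “8” overall, the row reads .25, .50, .25 for stimulus ratings 10, 9 and 8.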
B. The Mechanics of the Model

The input is now subjected to the first stage of the model, which
consists of the three phases or processes already discussed.

1. The Stimulation Process

Each of the four companies concerned decides on a stimulus.
Given a stimulus for each company, we subject a student to it in
the following way. Four random numbers (between 0 and 1) are
chosen, one for each company.

The stimulus and company in question and the group to which
the student belongs determine the stimulus table to be used. The
student's partisanship number determines which row of the
stimulus table is applicable. Finally, the random number picks
out a square in that row and a corresponding intensity or strength
of stimulus (an integer from 1 to 10 obtained from the horizontal
axis of the stimulus table). We illustrate by the following
example. Let $P_0^{(j)} = 9$, and suppose the “9” row takes on the
following form:

    9 | 2 | 4 | 2 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
       10   9   8   7   6   5   4   3   2   1

Then the random number is expected to fall in the:

    “10” square 2/10 of the time
    “9” square 4/10 of the time
    “8” square 2/10 of the time
    “7” square 1/10 of the time
    “6” square 1/10 of the time

and the other squares with probability 0.

Suppose that the random number falls in the “8” square; then
$I_1^{(j)} = 8$.

Let $I_1 = \max_{1 \le j \le 4} I_1^{(j)}$. In case there is no unique maximum,
we stimulate the student again with the same stimuli. If again
there is no unique maximum, we use a “coin tossing” mechanism
to produce a unique maximum. Once we have such a maximum,
i.e., suppose $I_1 = I_1^{(j)}$, then we say that C = j. Thus we infer that
if the student had to make a company choice on the basis of the stimuli of
the moment he would choose the jth company. We will ordinarily
stimulate a student only if the chosen stimulus is one that
he considers somewhat important or very important to him. However,
we may stimulate all the students if we prefer.
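
The row lookup and the tie handling can be sketched as below. This is an illustration under assumptions: row entries are taken to be tenths, the “coin tossing” mechanism is modelled as a uniform pick among the tied companies, and `draw_interest`, `choose_company`, and `draw_all` are hypothetical names.

```python
import random

RATINGS = list(range(10, 0, -1))          # column labels 10, 9, ..., 1
ROW_9 = [2, 4, 2, 1, 1, 0, 0, 0, 0, 0]    # the "9" row of the example, in tenths

def draw_interest(row_tenths, u):
    """Return the stimulus rating whose square contains u, with 0 <= u < 1."""
    scaled = u * 10
    cumulative = 0
    for rating, tenths in zip(RATINGS, row_tenths):
        cumulative += tenths
        if scaled < cumulative:
            return rating
    raise ValueError("u must lie in [0, 1) and the row must sum to 10 tenths")

def choose_company(draw_all, rng):
    """draw_all() returns the four interest numbers; try twice, then pick at random."""
    for _ in range(2):
        interests = draw_all()
        top = max(interests)
        leaders = [j for j, v in enumerate(interests, start=1) if v == top]
        if len(leaders) == 1:
            return leaders[0], interests
    return rng.choice(leaders), interests  # ASSUMPTION: uniform "coin toss" among ties

choice, interests = choose_company(lambda: [8, 9, 7, 6], random.Random(0))
```

With the fixed draw shown, company 2 has the unique maximum, so `choice` is 2 and no second stimulation is needed.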
2. The Discussion Process

Each student has assigned to him a friend chosen randomly
within his group. The student has choice number C(S) and the
friend, choice number C(F). The rules we set up for discussion
are as follows:

1. If C(S) = C(F), then we say that the student and friend
agree, there is no change in the student's numbers coming out
of the stimulation process, and he is ready to go into the third
process, the learning process.

2. If C(S) ≠ C(F), then we have disagreement, and the student
is asked to re-evaluate his position. He does this by being
restimulated with the same stimuli as before. Once restimulated,
we allow him to go into the learning process without further
discussion.

It is possible to take the discussion process out of the
model, and proceed instead directly from stimulation to learning.

3. The Learning Process

At this point each student has “interest” numbers $I_1^{(j)}$,
$1 \le j \le 4$, in addition to his initial “partisanship” numbers $P_0^{(j)}$,
“cumulative” numbers, and “mass” numbers $K^{(j)}$. In the learning
process we compute new partisanship numbers $P_1^{(j)}$ as follows:

\[
P_1^{(j)} = \frac{K^{(j)} P_0^{(j)} + I_1^{(j)}}{K^{(j)} + 1}
          = \frac{K^{(j)} P_0^{(j)}}{K^{(j)} + 1} + \frac{I_1^{(j)}}{K^{(j)} + 1}
\]

In the case in which the student has not been stimulated by the
jth company, i.e., the case in which the stimulus chosen by the jth
company is not deemed important by the student, we define
$P_1^{(j)} = P_0^{(j)}$ or, equivalently, set $I_1^{(j)} = P_0^{(j)}$; thus:

\[
P_1^{(j)} = \frac{K^{(j)} P_0^{(j)} + P_0^{(j)}}{K^{(j)} + 1} = P_0^{(j)}
\]

At this point, the student is ready to go through the second cycle
or stage of stimulation, discussion and learning. Since $P_1^{(j)}$ need
not be an integer, we round it to the nearest integer. The
appropriate row of the stimulus table to be used is then determined
by this rounded value, and so on. In the case in which the student is not
stimulated, we take his interest numbers from the partisanship number
of the previous stage, i.e., we let $I_n^{(j)} = P_{n-1}^{(j)}$. This is equivalent to
making $P_n^{(j)} = P_{n-1}^{(j)}$, as can be verified easily. Then we proceed
as above.

After n stages, the partisanship number is given by the
formula:

\[
P_n^{(j)} = \frac{K^{(j)} P_0^{(j)}}{K^{(j)} + n} + \frac{\sum_{i=1}^{n} I_i^{(j)}}{K^{(j)} + n}
\]
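
The n-stage formula can be compared numerically with a stage-by-stage update. The one-step recursion used in `stepwise` below is an assumption of this sketch, inferred from the n-stage formula and not quoted from the text; the function names and the numbers are hypothetical.

```python
def closed_form(p0, k, interests):
    """P_n = (K * P_0 + sum of I_1..I_n) / (K + n)."""
    return (k * p0 + sum(interests)) / (k + len(interests))

def stepwise(p0, k, interests):
    """Apply one learning step per stage; an unstimulated stage passes I_n = P_{n-1}."""
    p = p0
    for n, interest in enumerate(interests, start=1):
        # Weight the previous value by K + n - 1, the new interest number by 1.
        p = ((k + n - 1) * p + interest) / (k + n)
    return p

p_closed = closed_form(9, 2, [8, 10, 7])
p_steps = stepwise(9, 2, [8, 10, 7])
```

Both routes give 8.6 for an initial rating of 9 with K = 2 and interest numbers 8, 10, 7. A larger K pulls the result toward the initial rating, which is the sense in which a student with a high K is harder to influence.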


C. Output

For each student we will be able to read his partisanship number
after each stage, e.g.:

\[
P_0^{(1)}, P_0^{(2)}, \ldots;\; P_1^{(1)}, P_1^{(2)}, \ldots;\; \ldots;\; P_n^{(1)}, P_n^{(2)}, \ldots
\]

Next, we will get, after each stage, each company's average
partisanship number:

\[
\bar{P}_n^{(j)} = \frac{1}{234} \sum_{m=1}^{234} P_{n,m}^{(j)}
\]

where $P_{n,m}^{(j)}$ is the partisanship number of the mth student for the
jth company after the nth stage and where 234 is the sample size.

Finally, we will get the distribution for each company of the
$P_n^{(j)}$'s after each stage. For this purpose, we round off the $P_n^{(j)}$'s
to the nearest integer and express the number of them in each
box by a percentage of the total. We also express these kinds of
output by student group.


D. Description of the Experiment 

The experiment consisted of two parts. In the first part, we set
out to simulate in the model the actual advertising campaigns
which took place during the period in question. This turned out
to be rather difficult, for various reasons. For one, it was customary
to use several of the critical issues in one advertisement.
All of the companies were found to be using most of the critical
issues in their advertising. Another difficulty was that it was extremely
hard, if not impossible, to estimate company recruitment
budgets, media mixes, and the exact correspondence between
frequency of advertising messages and number of cycles through
which the message should be used in the model.

Because of this, we devised the idea of an “average” issue or
stimulus. The stimulus tables for the average issue were computed
by averaging out the corresponding entries in the stimulus tables
for all of the twelve issues. For example, the average stimulus
table for Company 1 and Group 1 was obtained by averaging out
the corresponding entries in the Company 1-Group 1 stimulus
tables for all of the twelve issues. We then used these average
stimulus tables through ten cycles of the model and compared the
results with the evaluations given in the second questionnaire. We
give the results by group and by totals.


                      Company I   Company II   Company III   Company IV

Group 1
  Questionnaire I        8.54        7.08          8.67          7.23
  Questionnaire II       8.72        7.49          8.41          7.31
  Model Prediction       8.69        7.59          8.62          7.64

Group 2
  Questionnaire I        8.57        7.14          7.10          8.04
  Questionnaire II       8.94        7.41          7.51          7.55
  Model Prediction       8.94        7.74          8.25          8.22

Group 3
  Questionnaire I        8.43        7.78          7.86          7.82
  Questionnaire II       8.53        7.73          7.92          7.41
  Model Prediction       8.80        7.92          8.45          7.92

Group 4
  Questionnaire I        7.96        7.57          7.78          7.74
  Questionnaire II       8.83        7.83          8.30          7.48
  Model Prediction       8.73        7.83          8.54          7.74

Group 5
  Questionnaire I        8.43        7.52          8.39          7.26
  Questionnaire II       8.70        7.65          8.52          7.39
  Model Prediction       8.82        7.68          8.50          7.61

Group 6
  Questionnaire I        8.14        7.23          7.73          7.18
  Questionnaire II       8.59        7.27          8.18          7.18
  Model Prediction       8.82        7.83          8.43          7.73

Group 7
  Questionnaire I        8.76        7.45          7.14          6.69
  Questionnaire II       9.00        7.52          7.28          7.21
  Model Prediction       9.10        7.86          8.27          7.20

Total
  Questionnaire I        8.44        7.39          7.78          7.50
  Questionnaire II       8.76        7.56          7.96          7.38
  Model Prediction       8.85        7.78          8.43          7.78

As can be easily verified, the model prediction falls well within
the most stringent χ² goodness of fit criteria. We achieved even
better results by running each of the twelve issues ten times and
averaging the results, assuming that only students who thought
the issue important would make a response to it. The results in this
case follow.


                      Company I   Company II   Company III   Company IV

Group 1
  Questionnaire I        8.54        7.08          8.67          7.23
  Questionnaire II       8.72        7.49          8.41          7.31
  Model Prediction       8.71        7.43          8.62          7.57

Group 2
  Questionnaire I        8.57        7.14          7.10          8.04
  Questionnaire II       8.94        7.41          7.51          7.55
  Model Prediction       8.79        7.48          7.97          8.15

Group 3
  Questionnaire I        8.43        7.78          7.86          7.82
  Questionnaire II       8.53        7.73          7.92          7.41
  Model Prediction       8.68        7.83          8.20          7.98

Group 4
  Questionnaire I        7.96        7.57          7.78          7.74
  Questionnaire II       8.83        7.83          8.30          7.48
  Model Prediction       8.42        7.79          8.23          7.84

Group 5
  Questionnaire I        8.43        7.52          8.39          7.26
  Questionnaire II       8.70        7.65          8.52          7.39
  Model Prediction       8.68        7.67          8.50          7.65

Group 6
  Questionnaire I        8.14        7.23          7.73          7.18
  Questionnaire II       8.59        7.27          8.18          7.18
  Model Prediction       8.69        7.55          8.24          7.62

Group 7
  Questionnaire I        8.76        7.45          7.14          6.69
  Questionnaire II       9.00        7.52          7.28          7.21
  Model Prediction       8.98        7.62          7.92          7.31

Total
  Questionnaire I        8.44        7.39          7.78          7.50
  Questionnaire II       8.76        7.56          7.96          7.38
  Model Prediction       8.72        7.62          8.23          7.79
In the second part of the experiment, we took each issue and
ran it through the model ten times for each company. Thus, we
started with Issue 1, “the company's standing in your field of
career interest,” and we let each of the companies use this issue
through ten cycles of the model. Then we started anew with Issue
2 and did the same thing. And so on, through Issue 12.

We did this part of the experiment in two ways. In the first
way, we let each of the students be stimulated by the given
issues. In the second way, we stimulated a student with an issue
only if he felt the issue was important to him. We will summarize some
of the results.

First of all, we defined an issue to be a good one for a given
company if through the use of it the company's average overall
rating went up more than 5 per cent. We defined the issue as
being a better one for a company than for another company if
its percentage rise was greater than that of the other company.
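
The two definitions translate directly into a classification rule. In the sketch below the ratings are invented for illustration only and are not results from the study; `percent_rise`, `classify_issue`, and `HYPOTHETICAL` are hypothetical names.

```python
def percent_rise(before, after):
    """Percentage change in a company's average overall rating."""
    return 100.0 * (after - before) / before

def classify_issue(own, rival, threshold=5.0):
    """own, rival: (rating_before, rating_after) pairs for the two companies."""
    own_rise = percent_rise(*own)
    rival_rise = percent_rise(*rival)
    return {"good": own_rise > threshold, "better": own_rise > rival_rise}

# Invented ratings, used only to exercise the two definitions.
HYPOTHETICAL = {
    "issue A": {"company 3": (8.0, 8.5), "company 1": (8.0, 8.2)},
    "issue B": {"company 3": (8.0, 8.2), "company 1": (8.0, 8.8)},
}
boxes = {
    name: classify_issue(r["company 3"], r["company 1"])
    for name, r in HYPOTHETICAL.items()
}
```

Crossing the two flags gives four boxes: an issue can be good or poor for the company, and the company can be better or not as good as its rival on it.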

Thus, in comparing Company 3 with Company 1, its main com- 
petitor, we found the following breakdown useful. 

                       Company 3 Better           Company 3 Not as
                       Than Company 1             Good as Company 1

Good Issues            Quality of products        Challenging work
(Company Issues)       Standing in field of       Caliber of personnel
                         career interest          Basic research
                       Drive to achieve goals

Poor Issues            Starting salary            Encouragement of
(Individual Issues)    Training program             ingenuity
                       Consideration of           Aid to education
                         employees
                       Rapid advancement

                       Business Image             Scientific Image

Thus, Company 3's good issues are “quality of products, standing
in field of career interest, drive to achieve goals, challenging
work, caliber of personnel, basic research,” which are basically
issues relating to how good a company it is. Its poor issues are
“starting salary, training program, consideration of employees,
rapid advancement, encouragement of ingenuity, aid to education,”
which are issues relating to how good a company it is for the
individual.

Also, the issues in which Company 3 is better than Company 1
are “quality of products, standing in field of career interest, drive
to achieve goals, starting salary, training programs, consideration
of employees, and rapid advancement,” which are issues having to
do with its business image. But the issues in which Company 3
does not do as well as Company 1 are “challenging work, caliber
of personnel, basic research, encouragement of ingenuity, aid to
education,” which are issues having to do with its scientific image.

In its advertising against Company 1, Company 3 can best afford
to stress the issues in the upper left-hand box: “quality of product,
standing in field of career interest, drive to achieve goals.” The
issues in the upper right-hand box, “challenging work, caliber of
personnel, basic research,” are also good issues for Company 3.
However, they represent a “trap” for it, since Company 1 is stronger
than it is in those issues. In fact, the basic research issues
have been so pre-empted by Company 1 that Company 3’s use of
them calls attention to Company 1’s excellence in them.

The issues in the lower left-hand box, “starting salary, training
program, consideration of employees, rapid advancement,” are not 
particularly good for Company 3 but Company 1 is no better, so 
that stressing these issues causes small gains for Company 3 as 
opposed to Company 1. Finally, the issues in the lower right-hand 
box, "encouragement of ingenuity and aid to education,” are bad 
ones for Company 3 in which Company 1 is better. These issues 
represent a product problem for Company 3 in relation to Com- 
pany 1. They are issues in which Company 3 will have to make 
basic changes before using them. 
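
The four-box breakdown described above can be sketched in a few lines of Python. This is only an illustration of the two definitions given in the text (a "good" issue raises the company's average rating by more than 5 per cent; an issue is "better" for a company when its percentage rise exceeds the rival's). The function name and the percentage rises in the example are hypothetical and are not taken from the experiment.

```python
def classify_issue(rise_own, rise_rival, good_threshold=5.0):
    """Place an issue in one of the four boxes.

    rise_own / rise_rival: percentage rise in average overall rating
    for the company and for its rival when each uses the issue.
    """
    quality = "good" if rise_own > good_threshold else "poor"
    standing = "better" if rise_own > rise_rival else "not as good"
    return quality, standing


# Hypothetical percentage rises (own company, rival), for illustration only.
example_rises = {
    "quality of products": (8.0, 6.0),
    "basic research": (7.0, 12.0),
    "starting salary": (3.0, 1.0),
    "aid to education": (2.0, 4.0),
}

boxes = {
    issue: classify_issue(own, rival)
    for issue, (own, rival) in example_rises.items()
}

for issue, (quality, standing) in boxes.items():
    print(f"{issue}: {quality} issue, {standing} than the rival")
```

An issue that is good but "not as good" as the rival's corresponds to the "trap" box discussed above; a poor issue on which the company still gains on its rival corresponds to the lower left-hand box.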

If we compare Company 3 with all three of its competitors, we
get the following table.

The numbers in parentheses represent the companies that are
better than Company 3.

                 Company 3 Better              Company 3 Not as Good
                 Than Competitors              as Competitors

Good Issues      Quality of Products           Challenging Work (1)
                 Standing in Field of          Caliber of Personnel (1)
                   Career Interest             Basic Research (1)
                 Drive to Achieve Goals

Poor Issues      Starting Salary               Encouragement of
                                                 Ingenuity (1, 4)
                                               Aid to Education (1, 2, 4)
                                               Training Program (2)
                                               Consideration of
                                                 Employees (2, 4)
                                               Rapid Advancement (2)

Here it is seen that there are three issues which are good ones
for Company 3 in which it does better than all of its competitors,
namely: “quality of products, standing in field of career interest,
drive to achieve goals.” Those are obviously Company 3’s best
opportunities for improving its standing against all its
competitors.


The issues in the upper right-hand box, “challenging work,
caliber of personnel, basic research,” are good ones for Company 3,
but it loses ground to Company 1 when both are using them.
However, it does gain on Companies 2 and 4. Company 1 seems to
have pre-empted those issues.

In the lower left-hand box, there is the issue, “starting salary,” 
which is not a particularly good one for Company 3; however, it 
does gain on all of its competitors through the use of it. This repre- 
sents an area for possible education of the population. 

Finally, in the lower right-hand box are issues which are not so
good for Company 3, and in which some competitor does better.
In fact, the issue, “encouragement of ingenuity,” is one in which
both Company 1 and Company 4 do better. All three competitors
do better on “aid to education.” Companies 2 and 4 do better on
“consideration of employees,” and Company 2 does better on
“training program” and “rapid advancement.”

Let us see how the issues divide up for the other three
companies.

Company 1

                 Company 1 Better              Some Competitor Better
                 Than Competitors              Than Company 1

Good Issues      Challenging Work              Quality of Products (3, 4)
                 Basic Research
                 Encouragement of
                   Ingenuity
                 Caliber of Personnel

Poor Issues                                    Aid to Education (2, 4)
                                               Drive to Achieve Goals (2, 3, 4)
                                               Training Programs (2, 3, 4)
                                               Standing in Field of Career
                                                 Interest (2, 3, 4)
                                               Starting Salary (2, 3, 4)
                                               Consideration of Employees
                                                 (2, 3, 4)
                                               Rapid Advancement (2, 3, 4)


This shows that Company 1’s good issues are “challenging work,
basic research, encouragement of ingenuity, caliber of personnel,
and quality of products.” Of those, the first four are issues in which
it does better than all of its competitors. In the last issue, “quality
of products,” there are two companies, Companies 3 and 4, which
do better. Its good issues tend to relate to how good a scientific
company it is.

Company 1’s poor issues are “aid to education, drive to achieve
goals, training program, standing in field of career interest,
starting salary, consideration of employees, and rapid advancement.”
Oddly enough, Company 1 does not top any of its competitors in
any of these issues. In fact, it is topped by all of its competitors in
all of the issues except for “aid to education,” where only Companies
2 and 4 do better. Company 1’s poor issues tend to be issues having
to do with its business characteristics and its dealings with the
individual.

Company 2

                 Company 2 Better              Some Competitor Better
                 Than Competitors              Than Company 2

Good Issues      Aid to Education              Quality of Products (1, 3, 4)
                 Training Programs

Poor Issues      Rapid Advancement             Drive to Achieve Goals (3, 4)
                                               Caliber of Personnel (1, 3)
                                               Encouragement of Ingenuity
                                                 (1, 3, 4)
                                               Basic Research (1, 3)
                                               Challenging Work (1, 3, 4)
                                               Standing in Field of Career
                                                 Interest (3, 4)
                                               Starting Salary (3, 4)
                                               Consideration of Employees
                                                 (3, 4)




Company 2’s good issues are “aid to education, training
program, and quality of products.” Of these, it is better than all of its
competitors in the first two. In the third one, “quality of products,”
it is excelled by all three of its competitors.

Company 2’s poor issues are “rapid advancement, drive to
achieve goals, caliber of personnel, encouragement of ingenuity,
basic research, challenging work, standing in field of career
interest, starting salary, and consideration of employees.” However, the
first of these, “rapid advancement,” is one in which it does better
than its competitors. This gives it a possible issue to use in an
educational way. In the remaining issues, there are at least two
competitors that do better in each.

Company 2’s best issues seem to deal with the help it gives the
individual to improve himself.

Finally, let us look at Company 4.

Company 4

                 Company 4 Better              Some Competitor Better
                 Than Competitors              Than Company 4

Good Issues                                    Quality of Products (3)
                                               Encouragement of Ingenuity (1)
                                               Drive to Achieve Goals (3)
                                               Standing in Field of Career
                                                 Interest (3)
                                               Aid to Education (2)

Poor Issues      Consideration of              Challenging Work (1, 3)
                   Employees                   Caliber of Personnel (1, 2, 3)
                                               Basic Research (1, 2, 3)
                                               Starting Salary (3)
                                               Training Programs (2, 3)
                                               Rapid Advancement (2, 3)


Company 4’s good issues are “quality of products, encouragement
of ingenuity, drive to achieve goals, standing in field of
career interest and aid to education.” However, in each of these
there is a competitor which does better.

Its poor issues are “consideration of employees, challenging
work, caliber of personnel, basic research, starting salary, training
programs, and rapid advancement.” Of these, Company 4 does
better than all of its competitors in the issue, “consideration of
employees,” and worse than some competitor in the others.

The reader will note that the issue, “quality of products,” is a
good one for all four companies, while the issues, “rapid
advancement” and “starting salary,” are poor ones for all of the companies.




A Decision-Theoretic Analysis of a 
Problem in Political Campaigning 

GERALD H. KRAMER 

University of Rochester 

1.1 In the past two decades, the use of quantitative methods as
aids for decision-making has become common in many fields,
particularly those involving industrial and military operations. More
recently, efforts have been made to apply these methods to other
governmental activities.¹ By and large, however, these efforts have
not been made by political scientists, nor have the methods
employed, despite their increasing sophistication and power, had
great impact upon the discipline. This is unfortunate, for many of
the traditional concerns of political scientists appear to be quite
susceptible to this sort of analysis. In this paper, we will attempt
to show how such a quantitative decision-theoretic approach
might be used to analyze a practical political problem, namely the
problem of conducting a door-to-door canvass of voters for
partisan campaign purposes.

Such a demonstration may be of interest for two reasons. First, 
it may lead to results which are of substantive or practical interest 
to the student of political campaigning. In the course of our an- 
alysis we will suggest some rough rules of thumb and then develop 
a systematic optimization procedure for efficient canvassing; we 
will also offer some tentative conclusions concerning the relative
efficiencies of several simpler canvassing strategies, and indicate 
the relevance of our findings to other campaign problems. 

The demonstration may also be of broader methodological
interest. In political science there has been considerable debate and
discussion as to whether certain concepts can be quantified, or
certain problems studied quantitatively. In fact, there is no reason
to doubt that quantitative research of some kind can be done on
almost any problem; the only interesting questions are whether it
should be done, and how. But these questions, as R. L. Ackoff, for
example, convincingly argues,² cannot be satisfactorily understood
except by examining them in the context of the uses to which the
research results are ultimately to be put. We will not be
specifically concerned with methodological questions here. Nevertheless,
by focusing explicitly on the question of uses, and by showing one
way in which quantitative empirical results can be applied to
solve a specific problem, we may at least be able to indicate a
perspective, by means of which some of the methodological issues
of quantitative empirical research may come to be better
understood.

¹ For examples, see R. N. McKean, Efficiency in Government Through Systems
Analysis (New York: Wiley, 1958); H. G. Schaller, ed., Public Expenditure Decisions
in the Urban Community (Baltimore: Johns Hopkins Press, 1963); and chaps. vii
and xiii of D. B. Hertz and R. T. Eddison, eds., Progress in Operations Research, Vol.
II (New York: Wiley, 1964).

1.2 The organization of this study is as follows: in section 2 we
formulate the overall problem of resource allocation in political
campaigning, within a general decision-theoretic framework. We
then narrow the focus to canvassing, and in section 3 develop a
simple quantitative model of a political canvass. In section 4 we
describe a general technique, based upon the model, which
systematically discovers the optimal allocation of canvassing effort, in
any constituency and for any budget size. We also describe some
simpler canvassing strategies, and then demonstrate and compare
all of these approaches by applying each to a hypothetical
constituency. This analysis is based upon a number of simplifying
assumptions; in section 5 we explore the question of how our
conclusions are affected when these simplifying assumptions are
relaxed, in various ways. Finally, some brief concluding remarks are
offered in section 6.

2.1 In general terms, we can describe a decision problem as
follows: we have a decision-maker, who is confronted with a set of
alternative, mutually exclusive courses of action, and who is
interested in attaining certain possibly conflicting goals or objectives.
The available alternatives are related to the objectives, perhaps in
complex and uncertain ways; the decision-maker’s problem is to
select that alternative which is “best” in terms of his goals.
Quantitative analysis of such a problem requires that we provide a
concise description of the problem, a precise criterion of “best,” and
finally, a systematic way to use this information to discover which
alternative is in fact best. A comprehensive solution to the overall
problem of conducting a political campaign is hardly feasible at
present. However, as a first step toward that ultimate goal, and
also as background for the more detailed treatment of canvassing
in sections 3 to 5, let us briefly attempt a preliminary formulation
of the overall problem.

² R. L. Ackoff, S. K. Gupta, and J. S. Minas, Scientific Method: Optimizing Applied
Research Decisions (New York: Wiley, 1962).

2.2 The range of alternatives confronting a candidate running
for office is truly enormous. Among the subjects dealt with in one
well-known campaign manual, for example, are the following:
registration drives, mail campaigns, house-to-house canvassing,
bumper sticker campaigns, special group activities, coffee parties,
larger receptions, plant visits, sound trucks, meetings and debates,
television, telephone campaigns, voter transportation, and poll
watching.³ Each such activity can be carried out in a variety of
ways, and the purpose of a campaign manual is presumably to
describe some of the more efficient ways.

In addition to these various tactical questions, there is also the
broader strategic question of deciding between activities. If our
resources are limited, then to increase the scale of one activity
(e.g., to make more use of TV) means we must cut back on some
other activity (e.g., plan a smaller canvass); somehow, we must
decide which activities to increase and which to cut back, in order
to achieve a balanced overall campaign strategy. Let us suppose
that there are $n$ distinct activities, and that for each we have a
quantitative measure of the overall level of the activity, e.g., so
many man-hours of canvassing, or hours of TV, etc. Then we can
concisely represent any campaign strategy by its activity levels
$X_1, \ldots, X_n$. The set of all possible strategies is the set of all such
$n$-tuples, and the set of available strategies is the subset of such
$n$-tuples which are feasible in terms of the resources available.
Thus, if the only resource which is limited is money, and if the $i$-th
activity costs $C_i$ dollars per unit, $i = 1, 2, \ldots, n$, then the set of
available strategies is the set of $n$-tuples which satisfy

$$\sum_{i=1}^{n} C_i X_i \le B,$$

where $B$ is the maximum possible campaign budget, in dollars.
The campaign problem is to determine which of these $n$-tuples is
“best,” according to some well-defined criterion of “best.”
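
As a minimal sketch of the budget constraint just stated, the following Python fragment computes the cost of a strategy and checks whether it is available. The function names, the unit costs, and the budget are hypothetical; the requirement that activity levels be non-negative is an added assumption that the text leaves implicit.

```python
def strategy_cost(levels, unit_costs):
    """Total cost sum(C_i * X_i) of a strategy with activity levels X_i."""
    if len(levels) != len(unit_costs):
        raise ValueError("levels and unit_costs must have the same length")
    return sum(c * x for c, x in zip(unit_costs, levels))


def is_available(levels, unit_costs, budget):
    """True if the strategy satisfies sum(C_i * X_i) <= B.

    Non-negative activity levels are assumed here (not stated in the text).
    """
    return all(x >= 0 for x in levels) and strategy_cost(levels, unit_costs) <= budget


# Hypothetical figures: activity 1 is canvassing at $2 per man-hour,
# activity 2 is television at $150 per hour, with a $6,000 budget.
unit_costs = [2.0, 150.0]
budget = 6000.0

modest = [1000, 20]      # costs 2,000 + 3,000 = 5,000
ambitious = [2000, 20]   # costs 4,000 + 3,000 = 7,000

print(is_available(modest, unit_costs, budget))
print(is_available(ambitious, unit_costs, budget))
```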

³ The Democratic Campaign Manual 1964 (Washington, D.C.: Democratic National
Committee, c. 1964).

2.3 Just as there are many ways of running a campaign, so also
there is a variety of possible goals which a candidate may be
pursuing. No doubt most candidates are interested in winning the
election. Even so, a third-party candidate, for example, may have
no real hope of winning, and may therefore gear his campaign
strategy to other goals, such as getting his “message” across, or
depriving one of the major-party candidates of votes. Even a
serious contender for office may place great stress upon factors
other than success, such as “educating” his constituents, whatever
the electoral consequences. But however important such goals
may be in specific instances, if a general analysis is to proceed we
must concentrate upon the major and most tangible of the goals.
For most candidates in most contests this goal is clearly to win.

Political campaigning is an uncertain business, in which no
campaign strategy can guarantee victory. Thus one plausible
quantitative translation of the goal of winning, which takes this
uncertainty into account, is that the candidate wishes, in selecting his
campaign strategy, to maximize his probability of winning. An
alternative, though related, formulation is that the candidate
wishes to maximize the size of his plurality (or, more precisely,
since uncertainty is present, his expected plurality $E(u_A - u_B)$,
where $u_A$ and $u_B$ are the votes cast for A and B, respectively, and
where $E$ is the expected-value operator⁴).

With the usual electoral arrangements, winning is normally
closely related to the size of the candidate's plurality. However,
these two formulations of the candidate's goal, though related, may
nevertheless lead to differing recommendations when used to
assess the value of alternative campaign tactics. This is particularly
likely if one of the tactics is very risky, but also potentially very
productive. For example, suppose the choice facing the candidate
is between adopting such a tactic ($t_1$), versus continuing present
tactics ($t_2$); moreover, suppose the candidate now has 55% of the
votes and will maintain this lead for sure with $t_2$. Tactic $t_1$, on the
other hand, will either gain another 40% or lose 10% of the vote, each
with probability .5. If his goal is to maximize his expected plurality,
the proper choice is to adopt $t_1$, since his expected plurality is 40%
then versus 10% with $t_2$. On the other hand, choosing $t_1$ over $t_2$
reduces the probability of winning from 1.0 to .5, and therefore if the
goal is to maximize his probability of winning, exactly the opposite
recommendation is in order. Other things being equal, presumably
most candidates would take the more conservative course and
adopt $t_2$; in that sense, the probabilistic objective is the more
realistic.

⁴ The expected value of a function is its average, defined by

$$E(f) = \sum_{x} x \Pr(f = x).$$

See, e.g., J. G. Kemeny et al., Finite Mathematical Structures (Englewood Cliffs, N.J.:
Prentice-Hall, 1961).
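
The arithmetic of the $t_1$ versus $t_2$ example can be checked with a short Python sketch. It assumes a two-candidate contest in which a vote share of $s$ per cent corresponds to a plurality of $s - (100 - s)$ percentage points and a win requires more than 50 per cent; the function names are illustrative only.

```python
def expected_plurality(outcomes):
    """Expected plurality in percentage points.

    outcomes: list of (probability, vote_share_percent) pairs for one tactic,
    assuming a two-candidate race so the opponent receives 100 - share.
    """
    return sum(p * (share - (100 - share)) for p, share in outcomes)


def win_probability(outcomes):
    """Probability that the candidate's vote share exceeds 50 per cent."""
    return sum(p for p, share in outcomes if share > 50)


# Tactic t1: from 55%, either gain 40 points (95%) or lose 10 points (45%).
t1 = [(0.5, 95), (0.5, 45)]
# Tactic t2: keep the present 55% for sure.
t2 = [(1.0, 55)]

for name, tactic in (("t1", t1), ("t2", t2)):
    print(name, expected_plurality(tactic), win_probability(tactic))
```

Under these assumptions $t_1$ has the larger expected plurality (40 against 10 points) while $t_2$ has the larger probability of winning (1.0 against .5), which is the conflict between the two criteria described in the text.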

However, this formulation is computationally quite difficult to
work with; in practical applications one would have to resort to 
simulation techniques which are expensive and often cumbersome. 
The expected-plurality criterion is much simpler in this respect, 
and possesses the convenient property that if we can evaluate the 
candidate’s expected plurality in each of several subunits (e.g., 
precincts) in his constituency, then his overall plurality can be 
obtained by simple summation. Clearly this is not true of the prob- 
abilistic criterion. Moreover, the expected-plurality criterion is 
more easily comprehended and communicated, since campaigners 
traditionally think in terms of so many votes gained or lost, and 
the criterion translates directly into these terms. Either formula- 
tion provides us with a reasonable, quantitative value criterion; 
however, in subsequent discussion we shall employ the expected- 
plurality criterion. 

2.4 Suppose the following: that we have settled on a value
criterion $V$; that, after extensive empirical analysis, we are able to
predict what level of $V$ will result from implementing any
particular campaign strategy $X_1, X_2, \ldots, X_n$; that activity $i$, $i = 1, 2, \ldots, n$,
costs $C_i$ dollars per unit (man-hour, TV-minute, or whatnot); and
that the total cost of whatever strategy we adopt shall not exceed
our budget $B$. The candidate wishes to find the best feasible
strategy; thus the overall campaign problem is to find $X_1, \ldots, X_n$ such
that

$$V(X_1, \ldots, X_n) = \text{maximum}, \quad \text{subject to} \quad \sum_{i=1}^{n} C_i X_i \le B. \tag{2.1}$$

Under certain reasonably general assumptions about the function
$V(X_1, X_2, \ldots, X_n)$,⁵ the following will be true: at the maximum, the
marginal increase in $V$ produced by spending an additional dollar
on any activity must equal that produced in any other activity.
Conversely, if the marginal increase in some activity $i$ is less than
that in another activity $j$, then clearly we can obtain a better
strategy by reallocating funds from $i$ to $j$ (unless the allocation to
$i$ is already zero dollars, in which case further reallocation is
impossible). These marginal increases, or marginal productivities,
play an important role in discovering and verifying a solution to
(2.1). Hence in studying campaigning, or any of the various
campaign activities, one important aim of the analysis is to provide a
basis for calculating these marginal productivities.

⁵ Specifically we assume $V$ to be continuous, increasing, and concave, and every
activity to be sufficiently productive so that the problem of corner maxima does not
arise. See, e.g., the Appendix, “The Simple Mathematics of Maximization,” in C. J.
Hitch and R. N. McKean, The Economics of Defense in the Nuclear Age (Cambridge:
Harvard University Press, 1961).
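
The equal-marginal-productivity condition can be illustrated numerically. The sketch below does not use any value function from the text: it assumes, purely for illustration, the separable concave form $V = \sum_i a_i \sqrt{X_i}$, for which the constrained maximum has a closed form. The weights, unit costs, and budget are hypothetical.

```python
import math


def optimal_levels(weights, unit_costs, budget):
    """Maximizer of V = sum(a_i * sqrt(X_i)) subject to sum(C_i * X_i) = B.

    Setting a_i / (2 * sqrt(X_i)) = lambda * C_i for every i and using the
    budget equation gives X_i = (a_i / C_i)**2 * B / sum(a_j**2 / C_j).
    """
    scale = budget / sum(a * a / c for a, c in zip(weights, unit_costs))
    return [scale * (a / c) ** 2 for a, c in zip(weights, unit_costs)]


def value(levels, weights):
    """The assumed value function V = sum(a_i * sqrt(X_i))."""
    return sum(a * math.sqrt(x) for a, x in zip(weights, levels))


def marginal_gain_per_dollar(levels, weights, unit_costs):
    """dV/dX_i divided by C_i: the gain from one more dollar on activity i."""
    return [a / (2 * math.sqrt(x)) / c
            for a, x, c in zip(weights, levels, unit_costs)]


# Hypothetical productivity weights, unit costs, and budget.
weights = [3.0, 2.0, 1.0]
unit_costs = [1.0, 2.0, 0.5]
budget = 100.0

best = optimal_levels(weights, unit_costs, budget)
gains = marginal_gain_per_dollar(best, weights, unit_costs)

# Move one dollar from activity 0 to activity 1; total cost is unchanged.
shifted = list(best)
shifted[0] -= 1.0 / unit_costs[0]
shifted[1] += 1.0 / unit_costs[1]

print("optimal levels:", best)
print("marginal gain per dollar:", gains)
print("value at optimum:", value(best, weights))
print("value after shifting a dollar:", value(shifted, weights))
```

At the computed allocation the three marginal gains per dollar coincide, and shifting a dollar between activities lowers $V$, as the argument in the text predicts for a concave value function.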

3.1 To demonstrate how such an analysis might proceed, we will 
examine one of these activities in greater detail. The problem 
which we consider is that of conducting a precinct-level door-to- 
door canvass of voters during a campaign, in order to pass out 
literature, reinforce the faithful and convert the opposition, and so 
on. In conducting a drive of this sort there are a number of choices 
to be faced, concerning which areas of the constituency shall be 
canvassed, what type of literature and of approach shall be em- 
ployed, which routes shall be assigned to which workers, and so 
forth. Here, we consider only the two broadest problems, concerning
the choice of localities and of “tactics,” a term to be defined
below.

Conducting a canvass requires the expenditure of various kinds 
of resources, such as labor, printed materials, etc. We assume that 
it is always possible to obtain additional quantities of any of these 
resources, at fixed costs, if necessary; hence the only resource 
limitation we need consider is the overall budget constraint. A 
canvassing budget of given size can be employed in a variety of 
ways, producing a variety of different effects. Our problem here is 
to determine— or more accurately, to obtain a method for deter- 
mining— which of these possible ways is "best," in terms of the 
expected plurality produced. As a first step in this endeavor, we 
proceed now to construct a model of a political canvass, with 
which we can assess the effects of alternative canvassing strategies. 

3.2 By a model of a canvass, we mean a symbolic representation
of the process, which can be manipulated for predictive purposes.
The elements of our model are the following: we assume the
electorate to be partitioned into a number $k$ of small, relatively
homogeneous units such as precincts or voting districts. In the
$i$-th precinct, let

$N^i$ be the number of residents,

$P_R^i$ be the fraction of registered voters, of whom

$P_V^i$ actually vote, and

$P_A^i$ and $P_B^i$ prefer parties A and B respectively.

Where we are speaking of a single precinct and no ambiguity
will result, we will usually omit the superscripts. Notice that, as of
the time of the canvass, the quantities $P_A$, $P_B$, $P_V$ are predicted
rather than actual values; they are forecasts of what will happen
several days or weeks hence, on election day. These predictions
need not be extremely accurate; extrapolation from past
comparable elections would suffice. We assume that it is possible to make
these predictions about each precinct, at negligible cost. We also
assume that it is possible, though not necessarily inexpensive, to
determine for individual voters within the district whether they
are registered, and which party they prefer. In a well-organized
precinct, this is the type of information which would be contained
in the party’s card file; in an unorganized precinct, it might be
possible to use official registration data (where partisan
registration is in effect), or it might be necessary to conduct a
precampaign canvass. Again, this information on individual voters need
not be perfectly accurate, though for simplicity we will assume,
initially, that it is. We also assume that in any single homogeneous
precinct, turnout and partisanship are statistically independent.

Next, we assume that within any single precinct there are two
basic tactics available to the party conducting the canvass. In the
first, which we will refer to as a “blind” canvass, the party
systematically contacts every person in the precinct, irrespective of
registration and partisanship. The second tactic is a “selective”
canvass, in which only registered partisans of the party are
contacted. Clearly there are other possibilities as well, such as
contacting all registered voters or attempting to contact only the
habitual nonvoters within the precinct. Here, however, we will
consider only the two representative tactics described above.

Our model must finally take into account the response of the
individual voters in the constituency to contact by a party worker.
Many different kinds of effect are possible, e.g., upon the voter’s
motivations and attitudes, his knowledge, or his subsequent
behavior. For our purposes, however, only those responses which
affect our candidate’s plurality in the forthcoming election are
immediately relevant; hence, we ignore the various possible
psychological effects and confine ourselves to the question of how
partisan contact affects the recipient’s subsequent voting behavior.
It is useful to distinguish between two possible types of
voting-behavior effect, which we shall refer to as preference and turnout
effects. By a preference effect, we mean any alteration in a voter’s
candidate preference, or, more precisely, in the probability that,
if he votes, he will cast his ballot for a given candidate. By a
turnout effect, we mean any alteration in the probability that he
will vote at all, for either candidate.

Obviously, the questions of the existence and of the magnitudes
of these effects are empirical questions, and can only be settled by
empirical investigation. In fact, several such investigations have
been performed by various researchers. It would be too much of a
diversion, here, to review the methods and results of each of these
studies; in summary, however, they seem to show the following:
Preference effects, in contested partisan elections, are small and
statistically insignificant in magnitude, and do not follow any
consistent pattern in direction. Henceforth we will ignore them.
Sizable turnout effects, however, do apparently exist. These effects
are positive, in the sense of increasing (rather than decreasing)
turnout probabilities, and for practical purposes their magnitudes
can be taken to be independent of such factors as the partisanship
of the contact or of the recipient, or the level of the office being
contested.⁶ For our purposes, of course, we need a precise and
quantitative description of these effects. The following simple
model is convenient to work with and has proven to be a realistic
formulation empirically:

$$\Pr(V \mid C) = \Pr(V \mid \bar{C}) + \alpha\,[1 - \Pr(V \mid \bar{C})] \tag{3.1}$$

Here, $\Pr(V \mid \bar{C})$ is the probability of voting in the absence of
contact, $\Pr(V \mid C)$ is the probability of voting after having been
contacted, and $\alpha$ is a parameter. In terms of relative frequencies, the
model asserts that if a large group of voters is canvassed, then the
final turnout rate will equal the precontact rate $\Pr(V \mid \bar{C})$, plus a
certain fraction $\alpha$ of that portion of the group which would not
otherwise have voted. That is, the rate (or probability) of
nonvoting is reduced by a constant fraction $\alpha$. Evidently the
precontact turnout probabilities (and therefore also the postcontact
probabilities) can vary from voter to voter, or precinct to precinct;
however, $\alpha$ is constant for all voters and for all precincts.
Empirically, a typical or average value of $\alpha$ is .4.⁷ (Clearly we are
speaking here only of registered voters, since both the pre- and
postcontact turnout probabilities of unregistered voters are always
zero.)

⁶ For a summary of most of the available evidence, see G. H. Kramer, “Decision-
Theoretic Analysis of Canvassing and Other Precinct-Level Activities in Political
Campaigning” (Doctoral dissertation, MIT, 1965), chaps. iii and iv.
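
Equation (3.1) translates directly into a one-line function. The sketch below is only an illustration: the function name is hypothetical, the default of 0.4 is the typical value of $\alpha$ quoted in the text, and the range checks are an added safeguard.

```python
def post_contact_turnout(pre_contact, alpha=0.4):
    """Turnout probability after contact, following equation (3.1).

    pre_contact: probability of voting in the absence of contact.
    alpha: fraction by which the probability of nonvoting is reduced
           (0.4 is the typical value quoted in the text).
    """
    if not 0.0 <= pre_contact <= 1.0:
        raise ValueError("pre_contact must be a probability")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return pre_contact + alpha * (1.0 - pre_contact)


# The boost is largest where precontact turnout is lowest.
for pre in (0.2, 0.5, 0.8):
    post = post_contact_turnout(pre)
    print(f"pre-contact {pre:.2f} -> post-contact {post:.2f}")
```

Equivalently, the probability of nonvoting after contact is $(1 - \alpha)$ times the precontact probability of nonvoting, which is the sense in which nonvoting "is reduced by a constant fraction."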

3.3 The value criterion which we wish to maximize is the
candidate’s constituency-wide expected plurality. This overall plurality
equals the sum of the candidate’s sub-pluralities in each precinct;
hence let us initially consider the effects of our tactics in a single
precinct. If we use $P_V$ as the value of $\Pr(V \mid \bar{C})$ and $P_A$ as the
probability of voting for candidate A, for a voter drawn at
random from the precinct in question, then evidently the expected
plurality for A in the absence of a canvass is

$$P_A P_V P_R N - P_B P_V P_R N = (P_A - P_B) P_V P_R N.$$

If we assume a pure two-party system, where $P_A + P_B = 1$, then
this reduces to

$$(P_A - [1 - P_A]) P_V P_R N = (2P_A - 1) P_V P_R N. \tag{3.2}$$

Now suppose that the entire precinct is canvassed blindly, that
is, all voters are contacted, regardless of affiliation. Evidently the
effect of such a canvass is to increase the turnout probability
somewhat, according to

$$\Pr(V \mid C) = \Pr(V \mid \bar{C}) + \alpha[1 - \Pr(V \mid \bar{C})] = P_V + \alpha(1 - P_V).$$

Hence the expected plurality becomes

$$[2P_A - 1][P_V + \alpha(1 - P_V)] P_R N,$$

and the net addition to the candidate’s plurality resulting from
the canvass is

$$[2P_A - 1][P_V + \alpha(1 - P_V)] P_R N - [2P_A - 1] P_V P_R N = \alpha[1 - P_V][2P_A - 1] P_R N. \tag{3.3}$$

More generally we can divide this expression by the number of
voters contacted, $N$, to obtain the net votes gained per contact
(or productivity per contact, $p_1$) for a blind canvass:

$$p_1 = \alpha(1 - P_V)(2P_A - 1) P_R. \tag{3.4}$$

Now suppose that a selective canvass is used, in which only
registered A-partisans are contacted. After the canvass, the
expected plurality is evidently

$$(P_V + \alpha[1 - P_V]) P_A P_R N - P_V(1 - P_A) P_R N = \alpha[1 - P_V] P_A P_R N + (2P_A - 1) P_V P_R N.$$

By subtracting (3.2) we obtain the addition to the candidate's
plurality produced by the canvass, and by dividing this by the
number, $P_A P_R N$, of voters contacted we have the per-contact
productivity for a selective canvass,

$$p_2 = \alpha(1 - P_V). \tag{3.5}$$

⁷ Ibid., especially pp. 72-75.

Note that $p_1$, unlike $p_2$, can become negative because of the
$(2P_A - 1)$ term; in neighborhoods where the party has only
minority support, blind canvassing is counterproductive. Both
expressions contain a $(1 - P_V)$ term, and hence either type of
canvass is relatively more effective in low-turnout neighborhoods, and
also in off-year elections.
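
The derivation of (3.3) through (3.5) can be checked numerically. The following Python sketch computes the expected plurality before and after each kind of canvass from the model's definitions and compares the per-contact gains with the closed-form productivities. The precinct figures are hypothetical, and the function names are illustrative.

```python
def baseline_plurality(n, p_r, p_v, p_a):
    """Expected plurality with no canvass, equation (3.2)."""
    return (2 * p_a - 1) * p_v * p_r * n


def blind_gain(n, p_r, p_v, p_a, alpha=0.4):
    """Net plurality added by contacting all N residents, equation (3.3)."""
    boosted = p_v + alpha * (1 - p_v)
    after = (2 * p_a - 1) * boosted * p_r * n
    return after - baseline_plurality(n, p_r, p_v, p_a)


def selective_gain(n, p_r, p_v, p_a, alpha=0.4):
    """Net plurality added by contacting only registered A-partisans."""
    boosted = p_v + alpha * (1 - p_v)
    after = boosted * p_a * p_r * n - p_v * (1 - p_a) * p_r * n
    return after - baseline_plurality(n, p_r, p_v, p_a)


def blind_productivity(p_r, p_v, p_a, alpha=0.4):
    """Per-contact productivity p_1, equation (3.4)."""
    return alpha * (1 - p_v) * (2 * p_a - 1) * p_r


def selective_productivity(p_v, alpha=0.4):
    """Per-contact productivity p_2, equation (3.5)."""
    return alpha * (1 - p_v)


# Hypothetical precinct: 1,000 residents, 80% registered,
# 60% predicted turnout, 55% predicted support for party A.
n, p_r, p_v, p_a = 1000, 0.8, 0.6, 0.55

blind_per_contact = blind_gain(n, p_r, p_v, p_a) / n
selective_per_contact = selective_gain(n, p_r, p_v, p_a) / (p_a * p_r * n)

print(blind_per_contact, blind_productivity(p_r, p_v, p_a))
print(selective_per_contact, selective_productivity(p_v))
# In a precinct where A is in the minority, blind canvassing loses votes.
print(blind_productivity(p_r, p_v, 0.4))
```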

Finally, let $c_1$ and $c_2$ be the costs per contact of conducting blind
and selective canvasses, respectively, where presumably $c_2 > c_1$.
As expenditure on either tactic increases, evidently the gain in
plurality increases initially with slope $p_1/c_1$ or $p_2/c_2$. Eventually,
when all suitable voters in the precinct have been contacted, further
increases in expenditure produce no additional gains in plurality.
Graphically, the overall gain-cost relations are as shown in Figure
3.1 (for a precinct for which $P_A > .5$).

Figure 3.1 [Gain in plurality plotted against expenditure for the
blind and selective canvasses; figure not reproduced]

4.1 Our model enables us to determine the consequences of any
particular allocation of resources to precincts and tactics, and
because of the very simple structure of the model, these
consequences can be readily traced out by hand calculation. Our
purpose in constructing the model was to use it in order to find the
best, or optimal, allocation; thus, in principle, we might try to
enumerate systematically every possible allocation, use the model
to predict the expected gain in plurality produced by each, and
finally select that allocation with the greatest gain. Obviously such
an approach is tedious at best, and furthermore there is no
assurance that it will ever discover the best allocation, since there
are infinitely many possibilities to be tried. It would clearly be
desirable to have an efficient and systematic method for
discovering the optimal allocation without the necessity of an
exhaustive search of the alternatives. We proceed now to describe
such a method.

4.2 In our overall optimization problem, we must decide how to 
allocate our resources across precincts, and also which tactic shall 
be employed in each precinct. Let us first consider the latter ques- 
tion; from inspection of Figure 3.1 it is evident that, in a precinct 
of the type depicted, if the expenditure in the precinct is large then 
the selective tactic will be preferred. The maximum gain possible
from blind canvassing (when every voter in the precinct is
contacted) is given by the maximum possible number of such
contacts, N, times the per-contact productivity, \(p_1\); thus, using
(3.4), the maximum gain is

\[ p_1 N = \alpha(2P_A - 1)(1 - P_V)\,P_R N. \tag{4.1} \]

In selective canvassing the maximum number of contacts is the
number of registered A-partisans, \(P_A P_R N\), and the per-contact
productivity is \(p_2\), so from (3.5), the maximum gain is

\[ p_2 P_A P_R N = \alpha(1 - P_V)\,P_A P_R N. \tag{4.2} \]

Whenever \(P_A < 1\), evidently this latter expression is larger, since
then \(P_A > 2P_A - 1\). When expenditure levels are large enough so
that the precinct can be “saturated,” the blind canvass is inferior
because it inevitably activates some opposition voters, whereas a
selective canvass does not. Conversely, in Figure 3.1 the blind
canvass is better at low expenditure levels because of its lower
cost per contact. Whether this is also true in other precincts
depends critically upon the relative costs per contact, upon the
registration rate (since contacts with unregistered persons,
however cheap, are wasted), and upon the relative number of
opposition voters in the precinct.
If \(P_A < .5\) then a blind canvass will be counterproductive and
the selective canvass will be the preferred tactic at all
expenditure levels. In general, comparing the respective per-dollar
productivities \(p_1/c_1\) and \(p_2/c_2\), a blind canvass will be
preferred (at low levels of expenditure) if and only if

\[ (2P_A - 1)P_R > \frac{c_1}{c_2} \]

for the precinct in question. Let \(f_i(x_i)\) be the function which
predicts the plurality gain in precinct i produced by spending \(x_i\)
dollars on a canvass with the preferred tactic; graphically, \(f_i\)
will be the envelope of the gain-cost functions shown in Figure 3.1,
represented by the dotted line labeled “f.”
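The per-dollar comparison reduces to a one-line test. The sketch below applies it to a few invented precinct profiles; the function name, the profile labels, and the cost figures of ten and twenty cents per contact are illustrative assumptions.

```python
def blind_preferred(p_a, p_r, c1, c2):
    """True when a blind canvass has the higher per-dollar productivity,
    i.e. when (2*P_A - 1) * P_R exceeds c1 / c2."""
    return (2 * p_a - 1) * p_r > c1 / c2


C1, C2 = 0.10, 0.20  # illustrative costs per contact, in dollars

# Hypothetical precinct profiles: (P_A, P_R)
examples = {
    "strong A, high registration": (0.9, 0.9),
    "strong A, low registration": (0.9, 0.7),
    "moderate A, high registration": (0.7, 0.9),
    "minority A": (0.4, 0.9),
}

choice = {
    name: "blind" if blind_preferred(p_a, p_r, C1, C2) else "selective"
    for name, (p_a, p_r) in examples.items()
}
for name, tactic in choice.items():
    print(f"{name}: start with a {tactic} canvass")
```

Note that the turnout rate and the activation rate cancel out of the comparison, so only partisanship, registration, and the cost ratio decide which tactic is better at low expenditure.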

Now consider the broader question of how our canvassing effort
shall be allocated across precincts. Formally, the problem is to
allocate the available canvassing budget B to the k precincts in
such a way as to make the total plurality gain F as large as
possible; thus,

\[ F = \sum_{i=1}^{k} f_i(x_i) = \text{maximum}, \]

subject to \hfill (4.3)

1. \(\sum_{i=1}^{k} x_i \le B\)

2. \(x_i \ge 0, \quad i = 1, \ldots, k\)

This is a familiar constrained-maximization type of problem;
however, the function to be maximized is not sufficiently smooth to
permit use of the calculus to find the maximum. Other techniques,
such as linear programming, do deal with piece-wise linear
functions, such as we have here; unfortunately, however, in the
present problem some of the payoff functions \(f_i\) are not concave.*
Without going into details, this means that (4.3), despite its very
elementary structure, in fact constitutes a problem in nonlinear
programming, and a solution procedure would be complicated. To
circumvent these difficulties we will modify the problem somewhat,
making it soluble by a much simpler procedure. The modification
consists of replacing the true payoff functions \(f_i\) by new,
approximate functions \(f_i'\), which are concave. In precincts where
the selective tactic is always better, the true function \(f_i\) is
already concave, so in this case \(f_i'\) and \(f_i\) are identical. Where
the blind tactic is better initially, however, \(f_i\) is not concave, so
we replace it by a concave function \(f_i'\); hence in this case the
concave approximation f' will be related to the true function f as
shown in Figure 4.1.

* A concave function is one which, roughly speaking, obeys a law of diminishing
returns; specifically a function f is concave if and only if the chord which connects
any two points of the function lies on or below the function between those points.
On the relevance of concavity to programming, see any standard text, or, e.g., H. M.
Markowitz and A. S. Manne, “On the Solution of Discrete Programming Problems,”
Econometrica, XXV (1957), 84 ff.

Figure 4.1 



We can write \(f_i'\) as a weighted sum

\[ f_i'(x_i) = a_{i1} y_{i1} + a_{i2} y_{i2}, \]

where \(a_{i1}\), \(a_{i2}\) are the slopes of the two line segments, and where

\[ x_i = y_{i1} + y_{i2}, \]

\[ 0 \le y_{i1} \le c_1 N^i, \]

\[ 0 \le y_{i2} \le c_2 P_A^i P_R^i N^i - c_1 N^i. \]

In precincts where the selective tactic is always best, evidently

\[ f_i(x_i) = a_{i1} y_{i1}. \]

Our modified problem is thus

\[ F' = \sum\sum a\,y = \text{maximum} \]
subject to

1. \(\sum\sum y \le B\) \hfill (4.4)

2. \(0 \le y \le c_1 N\) or \(c_2 P_A P_R N - c_1 N\)

where a, y are as defined above. This modified problem is readily
solved by the following simple algorithm: First, evaluate \(a_{i1}\),
\(a_{i2}\), \(c_1 N\), \(c_2 P_A P_R N\) for each precinct; second, arrange the
a's in order of decreasing size

\[ a_{(ij)1} \ge a_{(ij)2} \ge a_{(ij)3} \ge \cdots \]

and finally, for any budget B, invest in the y's in the same order,
setting each at its maximum and then going on to the next on the
list, until the budget is exhausted. When this procedure is
complete, we will have some \(y_{ij}\)'s which have been set to their
maximum values, some which are zero, and perhaps one (the last one
on the list before the budget ran out) which is less than maximal
but greater than zero. In any precinct, if \(y_{i2} > 0\), then we should
spend \(y_{i1} + y_{i2}\) dollars on a selective canvass in that precinct;
if \(y_{i2} = 0\) and \(y_{i1} > 0\), then we should spend \(y_{i1}\) dollars on a
blind canvass (or on a selective canvass, in precincts where the
selective tactic is always best); and if \(y_{i1} + y_{i2} = 0\), no
canvassing should be done in that precinct (since the money is
better spent elsewhere).
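The clerical procedure just described can be sketched in a few lines. This is an illustrative sketch only: the function name `allocate`, the segment labels, and the two-precinct figures are invented, and the early stop on non-positive slopes is an added safeguard rather than part of the procedure as stated.

```python
def allocate(segments, budget):
    """Greedy solution of the modified problem (4.4).

    segments: iterable of (label, slope, cap), where slope is votes gained
    per dollar and cap is the most dollars the segment can absorb.
    Returns (spend, gain): dollars per funded segment and the predicted gain.
    """
    spend, gain, remaining = {}, 0.0, budget
    # Fund the segments in order of decreasing slope.
    for label, slope, cap in sorted(segments, key=lambda s: s[1], reverse=True):
        if remaining <= 0 or slope <= 0:
            break  # budget exhausted, or nothing productive is left
        dollars = min(cap, remaining)
        spend[label] = dollars
        gain += slope * dollars
        remaining -= dollars
    return spend, gain


# Hypothetical two-precinct example (all numbers invented for illustration).
segments = [
    (("X", 1), 1.0, 100),  # precinct X, first segment: blind canvass
    (("X", 2), 0.2, 60),   # precinct X, second segment: switch to selective
    (("Y", 1), 0.8, 120),  # precinct Y, selective tactic always best
]
spend, gain = allocate(segments, budget=250)
print(spend, gain)
```

With a budget of 250 dollars the sketch saturates the first segment of X, then Y, and leaves the switch in X only partly funded, which is the single less-than-maximal y the text anticipates.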

It is straightforward to show that the allocation resulting from
this procedure is indeed a solution to (4.4); if \(y^*\) is the first y in
the sequence which has not been made as large as possible, and
\(a^*\) is its slope, then reallocating X dollars into some of the unused
y's will increase the gain by \(\le a^* X\) (since the a for all unused y
are \(\le a^*\)), while the loss produced by withdrawing these dollars
from earlier y will be \(\ge a^* X\) (since the earlier a's are all
\(\ge a^*\)); hence no such reallocation can increase the objective
function, and the original allocation is indeed a maximum.

However, (4.4) involves the concave approximations f' rather
than the true payoff functions f, and it is possible that a solution to
(4.4) is still not optimal in terms of the original formulation (4.3).
Our concave approximations are such that
\[ f(X) \le f'(X), \tag{4.5} \]

for any X. When \(X = 0\), or \(c_1 N\), or \(c_2 P_A P_R N\) (or more precisely,
where the corresponding y's are either zero or as large as possible)
then f and f' are equal. Let us choose our budget B so that this is
the case in every precinct, and let \(z_1, \ldots, z_k\) be the resulting
allocation; then since \(f(z) = f'(z)\) in each precinct, it follows by
summing over precincts that

\[ \sum f'(z) = \sum f(z). \tag{4.6} \]

Now suppose we reallocate in some fashion, so that the allocation
in precinct i becomes \(z_i + d_i\) dollars, and where
\[ \sum d_i \le 0 \tag{4.7} \]

(since we are not to exceed the budget constraint). By the
argument of the last paragraph, no such reallocation can increase
the concave-approximation payoff; hence
\[ \sum f'(z + d) \le \sum f'(z). \tag{4.8} \]

However, letting \(X = z + d\) in (4.5) and summing, evidently
\[ \sum f(z + d) \le \sum f'(z + d). \tag{4.9} \]

Hence if we combine this with (4.8) and (4.6), we have
\[ \sum f(z + d) \le \sum f'(z + d) \le \sum f'(z) = \sum f(z), \]
so that for this choice of B, no reallocation can increase the true
payoff \(\sum f(z)\). For other budget levels it follows from (4.8) and
(4.9) that the true gain will be less than, or possibly equal to, the
solution to the modified problem (4.4).

What we have shown, then, is the following: by reformulating 
the original problem we obtained a modified problem which is 
readily solved by a simple clerical procedure. For certain budget 
sizes, the allocation obtained by solving the modified problem is 
optimal with respect to the original problem (4.3) also; for other 
budgets we tend to overestimate the true gain produced by the 
recommended allocation. Even so, however, the allocation is
near-optimal, in the sense that in only one precinct (or more
precisely, one of the \(y_{ij}\)) will resources have been committed in a
less-than-optimal manner. For practical purposes, then, we have a solution
procedure for our canvassing model. In section 4.4 we will apply 
the procedure to a hypothetical constituency. 

4.3 The algorithm described above produces canvasses in which 
both inter-precinct resource allocation and intra-precinct tactical 
choice have been optimized. Though the procedure is simple to 
apply, nevertheless it does require more clerical and computa- 
tional effort than would be needed if a simpler canvassing strategy 
were used. A relevant question, therefore, is whether this type of 
formal optimization is worth the extra effort it requires. To gain 
some insight into this question, let us consider some alternative, 
simpler canvassing strategies. The approach described in the
preceding section, in which both inter-precinct resource allocation
and tactical choice are optimized, we shall refer to as “full”
optimization.

At the opposite extreme, the simplest type of canvass is one in 
which all canvassing is blind, and is conducted upon arbitrarily or 
randomly selected voters. There is reason to believe that a great 
deal of canvassing in American elections is done blindly, and while 
clearly we would not expect any sensible party to select precincts 
at random in the literal sense, by means of a random number table, 
nevertheless it may be that whether a voter is canvassed or not
depends upon factors which are essentially unrelated to productivity
or efficiency, such as the availability of volunteers or a block
captain locally, or the state of the organization in the precinct. If
this is so, then taking the voters to be chosen at random is a
reasonable, if rough, representation. We refer to this type of
operation as “blind-random” canvassing. The per-dollar productivity
of such a canvass is a weighted average of the blind-canvass
productivities in each precinct, the weights being the relative sizes
of the precincts.

A more complicated but presumably more efficient mode of 
operation is where all canvassing is done blindly, but the precincts 
to be canvassed are chosen optimally. The Democratic canvass
conducted in Los Angeles County in the 1962 California
gubernatorial campaign may have approximated this pattern.* To
choose optimally we compute the blind-canvass productivities
\(p_1/c_1\) for each precinct, rank the precincts in order of decreasing
productivity, and invest in them in that order until the budget is
exhausted, or until the productivities become negative, whichever
occurs first. We shall refer to this as a “blind-optimal” canvass.

Still another mode of operation is always to canvass selectively, 
but in precincts chosen at random. The average productivity is a 
weighted average of each of the selective-canvass productivities, 
the weights being the fraction of all registered A-partisans be- 
longing to each precinct. We refer to this type of operation as a 
“selective-random” canvass. 

Finally, we have a “selective-optimal” canvass, in which we 
canvass selectively in the most productive precincts. To select the 
most productive precincts, we rank them according to their
selective-canvass productivities \(p_2/c_2\) and then invest in the
precincts in that order, until the budget is exhausted.

To recapitulate, in planning a canvass we must decide which 
precincts to canvass, and which tactic to use in those precincts. 
If the blind tactic is to be used everywhere, then we have either 
a blind-optimal or a blind-random canvass according to whether 
we attempt to choose the most productive precincts or not; if the 
selective tactic is to be used throughout, then we have either a 
selective-optimal or selective-random canvass, again depending 
upon whether the choice of precincts is optimized or not. Finally,
full optimization resembles blind or selective optimization in at- 
tempting to optimize the choice of precincts, but it differs from 
both in that it does not require the same tactic to be used every- 
where. 

4.4 In order to compare these modes of operation we will apply
each to the constituency described in Table 4.1. The data are
imaginary; however, they were chosen so as to present a plausible
range of precinct types, and also (by setting \(P_A > .5\) in most
precincts) so as to make blind canvassing reasonably productive.

* See Helen Fuller, “The Man to See in California,” Harper's Magazine, CCXXVI
(January, 1963), 64 ff.


Precinct    P_A    P_R    P_V      N

    1       .9     .9     .8     1000
    2       .9     .9     .6     1000
    3       .9     .7     .8     1000
    4       .9     .7     .6     1000
    5       .7     .9     .8     1000
    6       .7     .9     .6     1000
    7       .7     .7     .8     1000
    8       .7     .7     .6     1000
    9       .4     .9     .8     1000
   10       .4     .9     .6     1000
   11       .4     .7     .8     1000
   12       .4     .7     .6     1000

Table 4.1

We assume that blind canvassing costs ten cents per contact
(which is probably a realistic, though rough, figure), and that
selective canvassing costs twenty cents per contact (which is a
guess). To apply the algorithm it is necessary to obtain the slopes
\(a_{i1}\), \(a_{i2}\) of the two line segments of the concave approximations.
From the expressions (4.1), (4.2) we can calculate the gains \(G_1\),
\(G_2\) of saturating any precinct with each tactic, and similarly we can
calculate the costs \(C_1\), \(C_2\) of doing so. These quantities are
tabulated in columns (2) to (5) of Table 4.2. The productivity of the
blind tactic is then \(G_1/C_1\); of the selective tactic, \(G_2/C_2\); and
of the transition, from blind to selective saturation,
\((G_2 - G_1)/(C_2 - C_1)\). These quantities are tabulated in columns
(6), (7) and (8) of Table 4.2.

The slope \(a_{i1}\) of the initial portion of the payoff function \(f_i'\) of
precinct i is given by the larger of \(p_1/c_1\), \(p_2/c_2\), as indicated in
columns (6) and (7) of the table; if the larger is the blind-canvass
productivity \(p_1/c_1\), there is a second slope \(a_{i2}\), given in
column (8). To obtain an efficient canvass we invest to saturation
in order of decreasing productivity (i.e., decreasing \(a_{ij}\)). The most
productive opportunity (with a productivity of 1.15 votes per
dollar) is a blind canvass in precinct 2; thus we first allocate $100
to saturate that precinct with a blind canvass. We then allocate
$100 to the second most productive opportunity (.9 votes/dollar),
a blind canvass in precinct 4; then $126 to saturate precinct 6 (or
8, 10, or 12, which are equally productive) with a selective
canvass, and so on until the budget is exhausted. If the budget is
large enough, the final allocation will be to the least productive
opportunity, a switch from blind to selective saturation of
precinct 1, which costs $62 and produces only .12 votes per dollar
spent, for a total of 7.4 additional votes. To determine the overall
gain we sum the gains produced by each expenditure; thus a budget
of $100 produces 115 votes, $200 produces 205 votes, $326
produces 306 votes, and so on.



                Blind              Selective
Precinct     G_1      C_1       G_2      C_2      G_1/C_1            G_2/C_2            (G_2 - G_1)/(C_2 - C_1)
           (Votes)    ($)     (Votes)    ($)       (V/$)              (V/$)              (V/$)
  (1)        (2)      (3)       (4)      (5)        (6)                (7)                (8)

   1          58      100        65      162      .58 = a_{1,1}      .40                .12 = a_{1,2}
   2         115      100       130      162     1.15 = a_{2,1}      .80                .23 = a_{2,2}
   3          45      100        50      126      .45 = a_{3,1}      .40                .22 = a_{3,2}
   4          90      100       101      126      .90 = a_{4,1}      .80                .43 = a_{4,2}
   5          29      100        50      126      .29                .40 = a_{5,1}      --
   6          58      100       101      126      .58                .80 = a_{6,1}      --
   7          22      100        39       98      .22                .40 = a_{7,1}      --
   8          45      100        78       98      .45                .80 = a_{8,1}      --
   9         -15      100        29       72     -.15                .40 = a_{9,1}      --
  10         -30      100        58       72     -.30                .80 = a_{10,1}     --
  11         -11      100        22       56     -.11                .40 = a_{11,1}     --
  12         -22      100        45       56     -.22                .80 = a_{12,1}     --

Table 4.2
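The full-optimization figures quoted above can be reproduced from Table 4.1. The sketch below is an illustration under stated assumptions: the activation rate `ALPHA = 0.4` is not given in this section and is inferred here from the tabulated gains, and the function names are invented. Because it works with unrounded values, its results agree with the rounded table entries only to within a vote or so.

```python
ALPHA = 0.4           # assumption: inferred from Table 4.2, not stated in this section
C1, C2 = 0.10, 0.20   # cost per contact in dollars: blind, selective

# Table 4.1: (P_A, P_R, P_V, N) for precincts 1 through 12
PRECINCTS = [
    (0.9, 0.9, 0.8, 1000), (0.9, 0.9, 0.6, 1000),
    (0.9, 0.7, 0.8, 1000), (0.9, 0.7, 0.6, 1000),
    (0.7, 0.9, 0.8, 1000), (0.7, 0.9, 0.6, 1000),
    (0.7, 0.7, 0.8, 1000), (0.7, 0.7, 0.6, 1000),
    (0.4, 0.9, 0.8, 1000), (0.4, 0.9, 0.6, 1000),
    (0.4, 0.7, 0.8, 1000), (0.4, 0.7, 0.6, 1000),
]


def segments(p_a, p_r, p_v, n):
    """Line segments (slope, dollar capacity) of the concave approximation."""
    g1 = ALPHA * (2 * p_a - 1) * (1 - p_v) * p_r * n  # blind saturation gain, (4.1)
    g2 = ALPHA * (1 - p_v) * p_a * p_r * n            # selective saturation gain, (4.2)
    cost1 = C1 * n
    cost2 = C2 * p_a * p_r * n
    if g1 / cost1 > g2 / cost2:
        # Blind first, then the switch to selective saturation
        # (cost2 > cost1 holds for every such precinct in this table).
        return [(g1 / cost1, cost1), ((g2 - g1) / (cost2 - cost1), cost2 - cost1)]
    return [(g2 / cost2, cost2)]  # selective tactic is always best


def full_optimization_gain(budget):
    """Predicted plurality gain when segments are funded by decreasing slope."""
    ordered = sorted((s for p in PRECINCTS for s in segments(*p)), reverse=True)
    gain, remaining = 0.0, budget
    for slope, cap in ordered:
        dollars = min(cap, remaining)
        gain += slope * dollars
        remaining -= dollars
    return gain


for budget in (100, 200, 326):
    print(budget, round(full_optimization_gain(budget), 1))
```

The three budgets correspond to saturating precinct 2 blindly, then precinct 4 blindly, then one of the equally productive selective opportunities.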


In the blind-optimal type of canvass, we optimize only with
respect to the blind-canvass productivities in column (6); thus
we invest in precincts 2 (for a cumulative gain of 115 votes),
4 (205 votes), 6 (263 votes), etc. The final allocation, at an
expenditure level of $800, is to precinct 7; even if the budget is
larger we never allocate funds to the remaining precincts 9 through
12, since the results would be counterproductive. A
selective-optimal canvass is handled similarly, using the
selective-canvass productivities in column (7).

To obtain the per-dollar productivities for blind-random and
selective-random canvasses we average the productivities in
columns (6) and (7); the results are .32 votes per dollar (for
budgets ≤ $1200) and .60 votes per dollar (for budgets ≤ $1280)
respectively.
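These two averages can be checked directly from Table 4.1. The following sketch assumes an activation rate of 0.4 (inferred from Table 4.2, not stated in this section) and uses the weights described in section 4.3: precinct size for the blind-random canvass and registered A-partisans for the selective-random canvass.

```python
from itertools import product

ALPHA = 0.4           # assumption: inferred from Table 4.2
C1, C2 = 0.10, 0.20   # cost per contact in dollars: blind, selective
N = 1000              # every precinct in Table 4.1 has 1000 voters

# Table 4.1 in row order: P_A varies slowest, then P_R, then P_V.
PRECINCTS = list(product((0.9, 0.7, 0.4), (0.9, 0.7), (0.8, 0.6)))

# Per-dollar productivities, columns (6) and (7) of Table 4.2.
blind = [ALPHA * (1 - pv) * (2 * pa - 1) * pr / C1 for pa, pr, pv in PRECINCTS]
selective = [ALPHA * (1 - pv) / C2 for pa, pr, pv in PRECINCTS]

# Blind-random: weights are precinct sizes, which are all equal here.
blind_random = sum(blind) / len(blind)

# Selective-random: weights are each precinct's registered A-partisans.
weights = [pa * pr * N for pa, pr, pv in PRECINCTS]
selective_random = sum(w * s for w, s in zip(weights, selective)) / sum(weights)

print(round(blind_random, 2), round(selective_random, 2))
```

In this constituency the weighted and simple averages of column (7) coincide, because the high-turnout and low-turnout precincts have the same mix of partisanship and registration.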

To get a general picture of how these modes of operation
compare in efficiency we have plotted in Figure 4.2 the expected
plurality gain produced by each, for budgets up to $1280.*

Figure 4.2

* The “Full Optimization” plot is based, for convenience, on the concave
approximations f' rather than the true yields f.

Conversely, the blind-random canvass is uniformly the least
efficient. If for some reason only the blind tactic is possible, then
clearly optimization (the B-O mode) is worthwhile. Except at very
large budgets the same is also true of the selective tactic; as the
budget approaches $1,280, however, the S-O and S-R modes (and
the F-O mode as well) become identical, since in all cases the
recommended field activity, a selective canvass of the entire
constituency, is the same. At small budget levels, blind
optimization is superior to either type of selective canvass, and it
is better than selective-random canvassing even at fairly large
levels. However, this is in part an artifact of our example; in
constituencies not so overwhelmingly pro-A in partisanship, B-O
canvassing could be inferior to S-O or even S-R canvassing at every
budget level. In such constituencies the relative advantage of F-O
over S-O would also decrease, since use of the blind tactic, which
is the only difference between F-O and S-O, would be less
attractive. The margin between S-O and S-R would remain, since it
depends basically on the heterogeneity of the constituency, rather
than its partisanship; and the advantage of B-O over B-R would
grow, since in a balanced constituency blind-random canvassing
would be unproductive or even counterproductive.

5.1 The analysis of the preceding sections has been based on a 
specific model of canvassing. Like any model, ours is a drastically 
simplified representation of a very complex and uncertain process. 
In empirical research of this kind, the investigator is always faced 
with a fundamental choice: whether to adopt a simple and
therefore useful model, at the risk of being too simple and hence
unrealistic; or whether, on the other hand, to employ a more
complicated but more realistic model, which may prove to be too
complex to be of much practical use. In the present study we chose
a simple model, and as a result the analysis has been relatively 
straightforward. Before accepting it, however, it is important to 
consider the extent to which our conclusions are sensitive to pos- 
sible violations of the assumptions on which the analysis is based. 
For example, we have assumed that certain information is avail- 
able for planning purposes, and that this information is perfectly 
reliable; but what if it is not available, or not reliable? Also, a 
peculiarity of the campaign problem (which it shares with many 
military problems) is that it is a game situation, in which
whatever strategy we select will be confronted by our opponent's
counterstrategy. What happens to our analysis if, for example, our
opponent conducts a canvass of his own? Let us briefly consider
some of these possibilities.

5.2 We have so far assumed that we are able to make accurate
forecasts of turnout levels, of the party's share of the vote in each
precinct, and of the partisanship of individual voters. On the basis
of these forecasts we can predict the effects of alternative
allocations of our canvassing resources, and can therefore identify
that allocation which is best. How sensitive are our conclusions to
inaccuracies in these forecasts?

First consider the turnout predictions. Clearly certain
combinations of errors could throw our calculations seriously awry;
for example, if errors were concentrated in those precincts for
which one tactic was best, then the effects and the relative
efficiencies of the various modes of operation could be greatly
altered. Such malevolent errors are always possible, but other types
of error are more likely and are therefore of more interest.
Forecast errors might be randomly and independently distributed
across all precincts; provided they are not too large, the analysis is
not substantially affected. A more important type of
turnout-forecast error is where all precincts are affected
comparably; for example, good or bad weather might cause all
turnout rates to be generally above or below the forecasts. Suppose
that the actual nonvoting rates \(1 - P_V'\) are \(\beta\) times the predicted
rates (where \(\beta > 1\) for bad weather, \(\beta < 1\) for good weather).
By inspection of (3.4) and (3.5), it is clear that using either tactic
will yield \(\beta\) times the predicted gain, and thus their relative (but
not absolute) efficiencies are unaffected.

Now consider the partisanship forecasts; let us suppose that our
information is unreliable, in the sense that only a certain fraction
\(\beta < 1\) of the voters actually do vote as predicted, the remainder
voting for the opposition. If \(P_A'\) is the forecast value, then the
actual value of \(P_A\), taking account of defection from and to party
A, is evidently

\[ P_A = \beta P_A' + (1 - \beta)(1 - P_A') \]

\[ \phantom{P_A} = P_A'(2\beta - 1) + (1 - \beta). \]

The blind-canvass productivity (3.3) contains a \((2P_A - 1)\) term
which becomes, in terms of the forecast value,

\[ 2[P_A'(2\beta - 1) + (1 - \beta)] - 1 = 2P_A'(2\beta - 1) + 2 - 2\beta - 1 \]

\[ = (2P_A' - 1)(2\beta - 1). \]

Hence the true blind-canvass productivity will be \((2\beta - 1)\) times
the forecast value. In a selective canvass the defection from A
causes it in effect to become a blind canvass with productivity

\[ \alpha(1 - P_V)(2\beta - 1), \tag{5.1} \]

which again is \((2\beta - 1)\) times the forecast productivity (3.5).
Thus both tactics are again affected identically, and their relative
efficiencies are unaffected.
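The factorization above is easy to confirm numerically. This short sketch checks it on a small grid of forecast shares and reliability values; the function name and the grid are illustrative choices.

```python
def actual_share(forecast_share, beta):
    """Actual A share when only a fraction beta of voters vote as forecast
    and the remainder vote for the other party (two-way defection)."""
    return beta * forecast_share + (1 - beta) * (1 - forecast_share)


# Check that 2*P_A - 1 equals (2*P_A' - 1) * (2*beta - 1) on a small grid.
for forecast in (0.4, 0.7, 0.9):
    for beta in (0.6, 0.8, 1.0):
        lhs = 2 * actual_share(forecast, beta) - 1
        rhs = (2 * forecast - 1) * (2 * beta - 1)
        assert abs(lhs - rhs) < 1e-12

print(actual_share(0.9, 0.8))
```

For a forecast share of .9 and 80 percent reliability, for example, the actual share falls to about .74, and the blind-canvass term shrinks from .8 to .48, which is .8 times .6.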

A more pessimistic assumption (for A) is to suppose that
defection takes place exclusively from A to B, and not in the
opposite direction. If only \(\beta\) of the forecast A-voters actually do
vote for A, then the selective canvass productivity is again as in
(5.1); the blind-canvass productivity, on the other hand, becomes
\[ \alpha(2\beta P_A' - 1)(1 - P_V)P_R. \]

Thus the tactics are affected differently. To see what this means
for our various canvassing modes, let us consider $600 canvasses
of each type; the forecast and actual effects of each type of
canvass, conducted in the constituency described earlier, are
tabulated in Table 5.1 for \(\beta = .8\).


Type of          % Defection:
Canvass          0%             20%

F-O              513 votes      286 votes
S-O              480            288
B-O              410            200
S-R              360            216
B-R              192             62

Table 5.1


All modes are adversely affected by the defection, but the blind
canvasses B-O, B-R are most seriously hurt. (Conversely, had
defection from B to A occurred, these modes would have taken
greater advantage of the fact.) Although the selective tactic
requires more detailed information than does the blind-canvassing
tactic, nevertheless the modes which use this tactic (S-O, S-R) are
not more sensitive, and in Table 5.1 are less sensitive, to various
kinds of error in the information. However, these modes are
affected seriously if the required information is simply
unavailable. If only some of the supporters of A are individually
known to the party, then the various modes will be affected as
shown in Table 5.2 (again for budgets of $600).

Type of          % of Supporters Known:
Canvass          100%           75%            50%

S-O              480 votes      432 votes      368 votes
B-O              410            410            410
S-R              360            360            360
B-R              192            192            192

Table 5.2


For the range of contingencies considered in the table, only
the S-O mode is affected; when fewer than 66% of the A-voters are
known, the B-O mode is better. If only 25% were known, then the
S-R mode could produce no more than 192 votes, the same as the
B-R gain; however, the cost would be much less, since a budget
of $320 would then suffice to contact all known A-supporters. Full
optimization, though not tabulated in the table, would be superior
throughout, since the algorithm (with minor modification) takes
account of information constraints by making more use of the
blind tactic where necessary; in the limiting case where no
supporters are known, full optimization becomes identical to blind
optimization.
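The S-O row of Table 5.2 can be reproduced with a small modification of the ranking procedure. The sketch below rests on two assumptions that are not spelled out in the text: that the activation rate is 0.4 (inferred from Table 4.2), and that knowing only a fraction of A-supporters simply scales down the dollars each precinct can absorb in a selective canvass while leaving the per-dollar productivity unchanged.

```python
from itertools import product

ALPHA = 0.4   # assumption: inferred from Table 4.2
C2 = 0.20     # cost per selective contact, in dollars
N = 1000      # voters per precinct in Table 4.1

# Table 4.1 in row order: P_A varies slowest, then P_R, then P_V.
PRECINCTS = list(product((0.9, 0.7, 0.4), (0.9, 0.7), (0.8, 0.6)))


def selective_optimal_gain(budget, known=1.0):
    """Gain of an S-O canvass when only a fraction `known` of A-supporters
    can be identified (assumed to scale each precinct's dollar capacity)."""
    segments = sorted(
        ((ALPHA * (1 - pv) / C2, known * C2 * pa * pr * N)
         for pa, pr, pv in PRECINCTS),
        reverse=True,
    )
    gain, remaining = 0.0, budget
    for slope, cap in segments:
        dollars = min(cap, remaining)
        gain += slope * dollars
        remaining -= dollars
    return gain


gains = {known: round(selective_optimal_gain(600, known))
         for known in (1.0, 0.75, 0.5)}
print(gains)
```

Under these assumptions the gain falls because part of the budget is pushed from the low-turnout precincts into the less productive high-turnout ones once the known supporters in the former have all been contacted.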

Another assumption which has been implicit throughout is that
the opposition does not conduct a canvass of its own. Let us
very briefly consider the consequences of relaxing this assumption.
First suppose that the opposition contacts \(\beta\) of all voters, at
random. Then evidently \(\beta\) of A's contacts are in effect wasted,
since those voters have already been (or will be) contacted, and
by our assumptions a second contact has no additional effect. Thus
all tactics and all modes are affected similarly, and the relative
efficiencies are unchanged.

The effects of an opposition selective canvass (at random) are
more interesting. Such opposition activity will not affect a
selective canvass by party A, since both parties contact only their
own known supporters. If A conducts a blind canvass, then none of
the contacts with A-supporters are wasted; on the other hand, \(\beta\)
of the contacts with B-voters are. Thus the blind canvass inspires
fewer additional B-voters to vote, and therefore, oddly, becomes
more productive; the actual per-contact productivity is
\[ \alpha[P_A - (1 - \beta)(1 - P_A)](1 - P_V)P_R. \]

When \(\beta\) approaches unity (i.e., when the opposition contacts all
the B-voters), then a blind A-canvass acts almost like a selective
canvass, except that some contacts are still wasted on unregistered
voters. Table 5.3 shows the effects of opposition selective
canvassing upon the different kinds of $600 A-canvasses.


Type of          % of B-voters contacted by opposition:
Canvass          0%             50%            100%

F-O              513 votes      528 votes      545 votes
S-O              480            480            480
B-O              410            467            529
S-R              360            360            360
B-R              192            288            384

Table 5.3


Even when the opposition canvasses half of its supporters the
efficiency ranking is unchanged; however, with a 100% canvass,
B-O surpasses S-O and B-R is better than S-R. Full optimization
remains the most efficient mode throughout.
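The B-R row of Table 5.3 follows from the per-contact productivity of a blind canvass facing an opposition selective canvass, namely the activation rate times [P_A - (1 - beta)(1 - P_A)](1 - P_V)P_R. The sketch below averages that productivity over the precincts of Table 4.1; the activation rate of 0.4 is an assumption inferred from Table 4.2, and the function names are invented.

```python
from itertools import product

ALPHA = 0.4   # assumption: inferred from Table 4.2
C1 = 0.10     # cost per blind contact, in dollars

# Table 4.1 in row order: P_A varies slowest, then P_R, then P_V.
PRECINCTS = list(product((0.9, 0.7, 0.4), (0.9, 0.7), (0.8, 0.6)))


def blind_productivity(pa, pr, pv, beta):
    """Per-contact productivity of a blind A-canvass when the opposition
    has already contacted a fraction beta of the B-voters."""
    return ALPHA * (pa - (1 - beta) * (1 - pa)) * (1 - pv) * pr


def blind_random_gain(budget, beta):
    """Gain of a B-R canvass: the budget times the average per-dollar
    productivity (precincts are of equal size, so a simple mean suffices)."""
    per_dollar = [blind_productivity(pa, pr, pv, beta) / C1
                  for pa, pr, pv in PRECINCTS]
    return budget * sum(per_dollar) / len(per_dollar)


gains = [round(blind_random_gain(600, beta)) for beta in (0.0, 0.5, 1.0)]
print(gains)
```

With no opposition canvass the expression reduces to (3.4); with a complete opposition canvass the blind tactic differs from the selective one only through the registration rate.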

6.1 In the preceding three sections, we have suggested a simple 
and general method for planning a political canvass, which seems 
to offer advantages over various simpler approaches to the prob- 
lem, and whose superiority in this respect does not seem to be 
highly sensitive to the specific assumptions on which our analysis 
was based. Whatever the practical relevance of these findings, the 
same method should be equally applicable to the very similar 
problems of telephone and mail canvassing. The same general 
approach, though with differences in detail, could be used to
analyze the problems of planning a precampaign, partisan registra- 
tion drive. 

Clearly there are other campaign activities (television activities,
for example) which are of a wholly different order of complexity.
Even there, however, there is reason to hope that systematic
quantitative analysis may become feasible in the not too distant
future; operational research on marketing problems, for example,
may lead to results of direct relevance to television campaigning.*
In any event, the use of quantitative methods for policy analysis 
has proved to be fruitful in many different fields, and these 
methods deserve to be more widely known, and used, in political 
science. 


* See, for example, J. D. Herniter and R. A. Howard, “Stochastic Marketing
Models,” chap. iii of Hertz and Eddison, op. cit., pp. 33 ff.

4. Models of the Political System

Introductory Note 

We are entering upon an age of reconstruction, in religion, in
science, and in political thought. Such ages, if they are to avoid
mere ignorant oscillation between extremes, must seek truth in its
ultimate depths. There can be no vision of this depth of truth apart
from a philosophy which takes full account of those ultimate
abstractions, whose interconnections it is the business of
mathematics to explore.

Alfred North Whitehead 

Although Plato believed in mathematics as the key to ultimate
philosophical truth, he used verbal means to express his models
of the political system. The mode of expression remained verbal
for twenty-three hundred years, and only recently have scholars
begun to convert to mathematical expression. The process of
conversion has contributed a new rigor to political models. When
the scholar attempts to translate his ideas into the language of
mathematics, verbal ambiguities are discovered and must be
eliminated. Vague ideas must be clarified and reduced to precision,
if they are to be expressed in mathematical symbols. When this
occurs, the scholar better understands his subject.¹

Whether verbal or mathematical, model-building requires
simplifying assumptions because not all variables can be identified
or controlled. This necessity of simplification is found in models of
the physical, as well as the social, sciences. Any doubt that this
is true is quickly dispelled when one contemplates what has
happened to the Newtonian models in twentieth-century physics. In
physics, as in other realms, mathematical models are merely
abstractions, designed to approximate the real world. Despite their
imperfections they have been useful to technology, as well as to
science. Social scientists have the additional problem of the human
psyche, which gives rise to a considerable variety of behaviors.
Political scientists have, however, a source of comfort. The psychic
problem has not caused psychiatrists and sociologists to despair,
although their disciplines are, in some ways, less amenable to
precise conceptualization than is political science.

¹ Otto A. Davis, “Final Critique of the Conference on Mathematical Application in
Political Science,” Southern Methodist University, Dallas, August 6, 1965.





Simplifying assumptions have been well-recognized limitations
on scientific model-building. In his perceptive coupling of some
modern political theories with some classical theories, William T.
Bluhm describes Anthony Downs² and William H. Riker as
“strategy theorists,” and he compares their method with that of
Hobbes. He quotes Chapter VII of The Leviathan:

No discourse whatsoever can end in absolute knowledge of fact, past 
or to come. For as for the knowledge of fact, it is originally sense, and 
ever after memory. And for the knowledge of consequence, which I have 
said before is called science, it is not absolute but conditional. No man 
can know by discourse that this or that is, has been, or will be, which is 
to know absolutely, but only that if this be, that is; if this has been, that 
has been; if this shall be, that shall be: which is to know conditionally,
and that not the consequence of one thing to another, but of one name 
of a thing to another name of the same thing. 

Bluhm observes that even though the [conditional] knowledge 
we have always remains knowledge of an abstract world, not a 
real one, 

... if we are good at fitting the right “general names” to the
particular “fancies” that inhabit our psyches, and if the rules we establish
correspond to empirical laws, our scientific knowledge provides us with
a powerful instrument of prediction and control over the world of
sensible particulars. We can interpret the real world in the light of the
model, and thus establish power over it. . . .

[Hobbes asserts] that the theoretical reason is not a device for under- 
standing and contemplating eternal objects, but an instrument for 
manipulating the world of sense, because the world of sense has a logi- 
cal structure to it, susceptible of being known under the categories of a 
model world of “general names.”³

Riker's article on the size principle is a postscript to his
important volume, The Theory of Political Coalitions, in which the
size principle is the central idea. Riker has undertaken no less than
the “creation of a theoretical construct that is a somewhat
simplified version of what the real world [of political coalitions] ... is
believed to be like.” He suggests that propositions from his model
can be validated or refuted empirically, and the model in
consequence can be perfected or abandoned.

² An Economic Theory of Democracy (New York: Harper & Row, 1957).
³ Theories of the Political System (Englewood Cliffs, N.J.: Prentice-Hall, 1965),
pp. 267-270.
Game theory provides the conceptual basis for the model. The
coalition theory includes these notions: an “n-person” game (more
than two competitors), “zero sum” (gains precisely equal losses),
rationality of the players. Riker's “irrefutable tautology,” following
Luce and Raiffa, states:

Given a social situation in which exist two alternative courses of action
leading to different outcomes and assuming that participants can order
these outcomes on a subjective scale of preference, each participant will
choose the alternative leading to the more preferred outcome.⁴

Assuming that “side payments” are permitted, Riker concludes
that winning coalitions tend toward the absolute minimum size
necessary for success. He also posits that a long-range result of
competition in a political system, which includes these
characteristics, is the elimination of participants. Consequently
disequilibrium rather than a “balance of power” occurs.

Riker's work has been criticized on the ground that its
simplifying assumptions, particularly the zero-sum assumption, make it
inapplicable to real world politics. The author, however,
anticipates this stricture with a variety of historical evidence to buttress
his theory and a perceptive analysis devoted to the question of
zero-sum applicability:

. . . whether or not one should use the zero-sum model depends
entirely on the way one's subject is commonly perceived. In discussing
bargains, which are perceived as mutual gain, of course, a non-zero-sum
model is probably best. On the other hand, in discussing elections and
wars, which are perceived as requiring indivisible victory, the zero-sum
model is probably best. . . .⁵

The article by Otto A. Davis and Melvin Hinich belongs to the
same genre as the works of Riker and of Anthony Downs. There
are the simplifying assumptions: Candidates have complete
information about voter preferences regarding issues, and voters
have complete information about candidate positions on all issues.
These respective attitudes are formed and made known prior to
nomination and election and are assumed to be unchanging. Each
voter bases his electoral decision on a rational consideration: he
supports the candidate whose position on the issues appears most
likely to maximize his (the voter's) utility. Candidates, if elected,
will adopt the policies announced prior to election. The model
assumes that policies are measured by certain indices and that all
voters use the same indices. Davis and Hinich demonstrate that,
given these assumptions, including the known distribution of
voters on a continuum, a median strategy normally wins over a
non-median strategy.

⁴ The Theory of Political Coalitions (New Haven and London: Yale University
Press, 1962, 1965), p. 18.

⁵ Ibid., p. 31.

The authors next consider the problem of a party nomination of 
a winning candidate. If a purely democratic nominating process 
is employed and if all voters in the system are members of a party 
in a two-party system, disequilibrium and resort to violence may 
occur, when a minority whose desires differ widely from the views 
of the majority are denied any chance of influencing policy. 

If the policy position of a minority party candidate is preferred 
by enough members of the majority party to constitute a slight 
majority of all voters, the candidate of the minority party may 
win. Thus the model, in its application to the nomination prob- 
lem, suggests that chances of victory in the general election may 
be improved by selection of a candidate whose position is a com- 
promise between the desires of his own party members and the 
members of the other party. 

The analysis reveals a dilemma of nominations: The democratic
method of choosing a nominee permits rational party voters to
seek maximization of their utilities by choosing a candidate whose
position is harmonious with their own. The dictates of general
election strategy, on the other hand, require a compromise
candidate who can appeal to some members of the opposition party.
This kind of nomination may be achieved by abandoning
democracy in favor of a “smoke-filled room” choice. The latter
enables the party to choose a candidate whose position is more
compatible with the entire population of voters (in both parties),
although it may be less preferred by the subset of voters in the
candidate's own party.

The reader will note that the Davis-Hinich model, although it
analyzes “conditional” knowledge, describes and explains
mathematically a number of observable uniformities which are found in
party systems of the real world.



A New Proof of the Size Principle 

WILLIAM H. RIKER

University of Rochester 

In The Theory of Political Coalitions I presented a proof of the
size principle, which is an adaption to the world of real coalitions
of the following inference from the theory of n-person games:

In n-person, zero-sum games, where side payments are permitted, where
players are rational, and where they have perfect information, only
minimum winning coalitions occur.

The proof of this inference was, however, somewhat involved, so
I take the opportunity of this paper to present a simpler and more
easily understandable direct proof.

I

As a preliminary step, let me recall for the reader some of the
main notions of n-person game theory as set forth by Von Neumann
and Morgenstern (2).

In two-person, zero-sum games, the problem faced by each
player is the selection of a strategy (i.e., a complete set of choices
for each possible move) such that the player receives an amount,
v, which is the most he can unilaterally guarantee himself and
the least his opponent can unilaterally hold him down to. In
n-person games, however, the problem faced by each player, at
least in all games where any kind of co-operation is permitted, is
less a selection of strategy and more the selection of partners.
Presumably two persons co-operating can sometimes accomplish more
than both can acting individually. Hence the main action in
n-person games is the formation of coalitions. Even though in the
n-person case the problem of play is different from the problem
of play in the two-person case, it is still possible to retain the
notion of a value, v, which is the most that can be unilaterally
guaranteed. Suppose a coalition, S, forms. Then the worst thing
that can happen to it is that its complement, -S, forms. (That
is, its complement, -S, can presumably give S more effective
opposition than can smaller coalitions, P, Q, and R, where
$P \cup Q \cup R = -S$.) If -S forms, we have something like a
two-person game between S and -S and hence can speak of a value
for S, v(S), which is called a characteristic function, and which
is the amount S can guarantee itself regardless of what -S does
and also the amount -S can hold S down to. The characteristic
function is a real valued set function with the following
properties:

(1) $v(\emptyset) = 0$, where $\emptyset$ is the empty set. (Presumably an empty
coalition is valueless.)

(2) $v(S) = -v(-S)$, which is the zero-sum condition.

(3) $v(I_n) = 0$, where $I_n$ is the identity subset of the set, N, of
players, that is, a coalition of the whole. (This property
is an inference from (1) and (2).)

(4) $v(S \cup T) \ge v(S) + v(T)$, where S and T are disjoint subsets
of N. When only the equality relation holds, the game is
said to be inessential (for there is no point to making
coalitions). Otherwise, the game is essential. In the
subsequent discussion we will be concerned only with
essential games.

An example of a characteristic function is:

If S has      0     1     2     3     4     5    members,
v(S) =        0   -20   -40    40    20     0
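The four properties can be checked mechanically for this example. The sketch below is illustrative only: it assumes the six values belong to coalition sizes 0 through 5 in a five-player game whose value depends on size alone, and the names `V_BY_SIZE`, `v`, and `check_properties` are introduced here rather than taken from the text.

```python
from itertools import combinations

# Values of v(S) by coalition size, from the example above. Reading the six
# values as sizes 0..5 of a five-player game is an assumption of this sketch.
V_BY_SIZE = {0: 0, 1: -20, 2: -40, 3: 40, 4: 20, 5: 0}
N = 5


def v(coalition):
    """Characteristic function of a symmetric game: depends only on size."""
    return V_BY_SIZE[len(coalition)]


def check_properties(n=N):
    """Return truth values for properties (1)-(4) and for essentiality."""
    players = frozenset(range(n))
    subsets = [frozenset(c) for k in range(n + 1)
               for c in combinations(range(n), k)]
    disjoint_pairs = [(s, t) for s in subsets for t in subsets if not (s & t)]

    empty_ok = v(frozenset()) == 0                                # property (1)
    zero_sum_ok = all(v(s) == -v(players - s) for s in subsets)   # property (2)
    whole_ok = v(players) == 0                                    # property (3)
    superadditive_ok = all(v(s | t) >= v(s) + v(t)                # property (4)
                           for s, t in disjoint_pairs)
    # Essential: the inequality in (4) is strict for at least one pair.
    essential = any(v(s | t) > v(s) + v(t) for s, t in disjoint_pairs)
    return empty_ok, zero_sum_ok, whole_ok, superadditive_ok, essential


print(check_properties())
```

Under that reading, all four properties hold and the game is essential, since, for instance, a one-player and a two-player coalition together are worth 40 while separately they are worth -60.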


In order to render characteristic functions in a form that allows
easy comparison among games, it is customary to normalize them
by letting the coalition of a single player be worth a given
minimum, say, -y. That is,

(5) $v([i]) = -y$

Setting -y = -1, we have the following normalized form for the
foregoing example:


If S has      0     1     2     3     4     5    members,
v(S) =        0    -1    -2     2     1     0
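The normalization can be sketched as a simple rescaling. This is an illustration under stated assumptions: the game is taken to depend on coalition size only, so dividing every value by the magnitude of the one-player value is enough, and the helper name `normalize` is hypothetical.

```python
from fractions import Fraction


def normalize(v_by_size, y=1):
    """Rescale a size-indexed characteristic function so that v([i]) = -y.

    Assumes (as in the example) that the value of a coalition depends only
    on its size and that a one-player coalition has a negative value.
    """
    single = v_by_size[1]
    if single >= 0:
        raise ValueError("expected a negative value for a one-player coalition")
    scale = Fraction(y) / Fraction(-single)
    return {size: value * scale for size, value in v_by_size.items()}


example = {0: 0, 1: -20, 2: -40, 3: 40, 4: 20, 5: 0}
normalized = normalize(example)
```

Exact fractions are used so that the rescaled values can be compared with integers without rounding error.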



Characteristic functions do not, however, completely describe
an n-person game. What counts for the individual player is not
just the value of the coalition, however much that may be, but
rather what portion of the value he personally receives. It is
conceivable that player i, whose individual receipts are denoted by
the symbol “$a_i$,” may prefer a coalition $S_2$ to a coalition $S_1$, where
$v(S_1) > v(S_2)$, if $a_i\ (i \in S_2) > a_i\ (i \in S_1)$. One must, therefore,
describe not only the payoff to coalitions but also the payoff to
individuals, which latter are called imputations. An imputation is
an n-tuple of real numbers, $a = (a_1, a_2, \ldots, a_n)$, which satisfies the
following conditions:

(6) $a_i \ge v([i])$, which asserts that no player will accept in any
coalition an amount less than he can obtain in a coalition
of himself alone; and

(7) $\sum_{i=1}^{n} a_i = 0$, which is not only the zero-sum condition, but also
asserts that rational players, whatever their
structure, will obtain the full value of the game.
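Conditions (6) and (7) translate directly into a small test. The following is a minimal sketch; the function name `is_imputation` and the sample payment vectors are hypothetical and serve only to show the two conditions at work.

```python
def is_imputation(payments, single_values):
    """Check conditions (6) and (7) for a vector of individual payments.

    payments:      list of a_i, one per player
    single_values: list of v([i]), the value each player secures alone
    """
    if len(payments) != len(single_values):
        return False
    # (6): no player accepts less than he can obtain alone.
    individually_rational = all(a >= alone
                                for a, alone in zip(payments, single_values))
    # (7): payments sum to zero (zero-sum, full value of the game obtained).
    zero_sum = sum(payments) == 0
    return individually_rational and zero_sum


alone = [-1] * 5  # normalized five-player game, v([i]) = -1 (illustrative)
print(is_imputation([1, 1, 0, -1, -1], alone))
print(is_imputation([2, 2, 2, -3, -3], alone))
```

The second vector sums to zero but fails condition (6), because two players would receive less than the -1 they can guarantee themselves alone.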



The task of n-person theory is to place some limitations on both
characteristic functions and imputations in order to render the
outcomes predictable. Von Neumann and Morgenstern initiated this
process with a discussion of the range of characteristic functions.
Specifically, they showed:

(8) if S has 0 members, $v(S) = 0$. (from (1))

(9) if S has 1 member, $v(S) = -y$. (from (5))

(10) if S has (n - 1) members, $v(S) = y$. (from (2) and (5))

(11) if S has n members, $v(S) = 0$. (from (3))

(12) if S has p members, where $2 \le p \le (n - 2)$,
then $-py \le v(S) \le (n - p)y$. (from (6))
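Assertions (8) through (12) can be collected into one function that returns the admissible range of v(S) for each coalition size. This is a sketch only; the name `value_range` and the requirement that n be at least 3 are assumptions made here for illustration.

```python
def value_range(p, n, y=1):
    """Return (low, high) bounds on v(S) for a coalition of p of n players.

    Follows assertions (8)-(12); assumes n >= 3 so that the one-player and
    (n - 1)-player cases do not coincide.
    """
    if n < 3:
        raise ValueError("this sketch assumes at least three players")
    if not 0 <= p <= n:
        raise ValueError("p must lie between 0 and n")
    if p == 0 or p == n:          # assertions (8) and (11)
        return (0, 0)
    if p == 1:                    # assertion (9)
        return (-y, -y)
    if p == n - 1:                # assertion (10)
        return (y, y)
    return (-p * y, (n - p) * y)  # assertion (12)


for size in range(6):
    print(size, value_range(size, 5))
```

For n = 5 and y = 1 the middle sizes give the ranges (-2, 3) and (-3, 2), which contain the normalized example values -2 and 2.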

Graphically these results can be shown thus: 

Figure 1 





where points (0,0), (1,-y), ((n-1),y), and (n,0) represent
assertions (8), (9), (10), and (11) respectively and where the
vertical lines represent assertion (12). Since Von Neumann and
Morgenstern did not wish to use the notion of a majority (because
they wished to allow for weighted players who, though fewer than 
a numerical majority, might win and because they wished to 
allow for discriminatory solutions in which some players were 
guaranteed minimum gains and losses), they could not narrow the 
range further. One can, however, use the notions either of a ma- 
jority of equally weighted persons or of a majority of equal units 
of weight, thereby preserving the feature of weights while per- 
mitting much further narrowing of the range of characteristic 
functions. (Here we will be concerned only with majorities of 
equally weighted persons; but for a presentation of the majority 
notion in terms of units of weight, see reference (1), pp. 253-61.) 
In so doing, we are, of course, limiting ourselves to nondiscrimina- 
tory solutions for, if the notion of a majority is used, discrimination 
can appear only as unequal weighting. 

Let m be the minimal value of a majority, where

(13) $\frac{n}{2} < m < n$. (Note that the right inequation is written
“$m < n$” rather than “$m \le n$,” as is often customary. If $m = n$,
there is nothing to do in the game except form the single
coalition, $I_n$, of all players, which fact renders
characteristic function theory trivial.)

The following definitions can now be offered:

(14) if $p > n - p$ and $p \ge m$, then $S_p \in W$, where W is the set of all
winning coalitions; if $S_p \in W$, then $v(S) \ge 0$; if $p \ne n$, then,
for $S \in W$, $v(S) > 0$, which follows from (4) since we have
assumed the game is essential.

(15) if $p = m$, then $S_p \in W^m$, where $W^m$ is the set of minimal
winning coalitions such that $S_{p-1} \notin W$.

(16) if $(n - p) \le p < m$, $S_p \in B$, where B is the set of blocking
coalitions; and $v(S) = 0$.

(17) if $S \notin W$ and $S \notin B$, then $S \in L$, where L is the set of losing
coalitions; $v(S) \le 0$; if $p > 0$, then $v(S) < 0$, which follows
from (2) and (14).

With these definitions it is possible to rewrite (12), narrowing the
range of v(S):

(18) if $S \in W$, then $0 \le v(S) \le (n - p)y$. (from (12) and (14))

(19) if $S \in B$, then $v(S) = 0$. (from (16))

(20) if $S \in L$, then $-py \le v(S) \le 0$. (from (12) and (17))
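Definitions (14) through (17) and the narrowed ranges (18) through (20) can be sketched as a classification by coalition size. The code assumes equally weighted players and a majority threshold m lying strictly between n/2 and n; the function names `classify` and `narrowed_range` are hypothetical.

```python
def classify(p, n, m):
    """Classify a coalition of p equally weighted players as W, B, or L.

    W = winning, B = blocking, L = losing, following (14), (16), and (17).
    Assumes n/2 < m < n, checked with integer arithmetic.
    """
    if not (n < 2 * m and m < n):
        raise ValueError("this sketch assumes n/2 < m < n")
    if not 0 <= p <= n:
        raise ValueError("p must lie between 0 and n")
    if p > n - p and p >= m:      # definition (14)
        return "W"
    if n - p <= p < m:            # definition (16)
        return "B"
    return "L"                    # definition (17)


def narrowed_range(p, n, m, y=1):
    """Bounds on v(S) by class, following (18), (19), and (20)."""
    kind = classify(p, n, m)
    if kind == "W":
        return (0, (n - p) * y)
    if kind == "B":
        return (0, 0)
    return (-p * y, 0)


print([classify(p, 5, 3) for p in range(6)])
print(classify(2, 4, 3), narrowed_range(3, 5, 3), narrowed_range(2, 5, 3))
```

With n = 5 and m = 3 every coalition is either winning or losing; a blocking coalition appears only when a coalition is at least as large as its complement yet still short of m, as with two players when n = 4 and m = 3.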

Ignoring the possibility of blocking coalitions, the results can be
shown graphically thus:

Figure 2



Even though the range of characteristic functions has thus been
narrowed largely by eliminating discriminatory solutions, we still
know relatively little about what coalitions might occur and about
what imputations might be associated with them. In this section,
I shall set forth another kind of restriction on coalitions which
permits a prediction about the range of occurrences and which is
sometimes useful in political analysis.

We can assume, of course, that, since the game is essential, some
S, $S \in W$, and some -S, $-S \in L$, occur (ignoring here the
possibility of blocking coalitions). If, however, there exists some $S_q$,
$S_q \in W$, and some imputation, a, associated with $S_q$, such that $S_q$
can guarantee its members more than they might receive in a
smaller coalition and at least as much as they might receive in a
larger one, then they would prefer coalitions of size q to all others.
Such coalitions, $S_q$, are realizable, while all others $S_p$, where $p \ne q$,
are unrealizable. Presumably, once a coalition reaches a realizable
size, it is relatively stable, except of course for internal squabbles
over the division of v(S) into $a_i$, $i \in S$.

The intuitive idea in the notion of realizable coalitions is that,
in the set W, there is a subset of realizable coalitions, $W^q$, such
that any coalition in $W^q$ is preferred to any coalition not in it
because in S, $S \in W^q$, the amounts that S can unilaterally (that is,
without the co-operation of -S) guarantee its members individually
are at a maximum and, for that maximum, the costs of
organization are minimal.

Stating this notion formally: For $S_p$, $S_q$, and $S_r$, where $p < q < r$,
$S_q$ is realizable if, for $S_q \in W$, it is possible that

(I) $a_i^{S_p} < a_i^{S_q}$; and

(II) $a_i^{S_q} \ge a_i^{S_r}$,

where the notation $a_i^{S_x}$ means “the payment to i when i is a
member of $S_x$” and $x \in \{p, q, r\}$.
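For one particular set of payment schedules, the two conditions can be checked as follows. This is only a sketch: the definition asks whether such payments are possible, whereas the code tests a single given triple, and the function name and sample payments are hypothetical.

```python
def satisfies_conditions(pay_p, pay_q, pay_r):
    """Check conditions (I) and (II) for given payment schedules.

    Each argument maps a player to the payment he receives as a member of
    S_p, S_q, or S_r respectively (p < q < r). Condition (I) is checked for
    players in both S_p and S_q, condition (II) for players in both S_q
    and S_r.
    """
    in_p_and_q = pay_p.keys() & pay_q.keys()
    in_q_and_r = pay_q.keys() & pay_r.keys()
    cond_i = all(pay_p[i] < pay_q[i] for i in in_p_and_q)    # condition (I)
    cond_ii = all(pay_q[i] >= pay_r[i] for i in in_q_and_r)  # condition (II)
    return cond_i and cond_ii


# Illustrative payments only (normalized five-player example, m = q = 3).
pay_p = {"A": -1, "B": -1}                   # losing coalition of two
pay_q = {"A": 1, "B": 1, "C": 0}             # divides v = 2
pay_r = {"A": 0, "B": 0, "C": 0, "D": 1}     # divides v = 1
print(satisfies_conditions(pay_p, pay_q, pay_r))
```

A result of True for one triple shows only that the conditions can be met for those schedules; it does not by itself establish that a coalition size is realizable in the sense of the definition.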

The theorem to be proved is: $W^q = W^m$. That is, only minimal
winning coalitions are realizable. In the proof, I shall show,
first, that S, $S \in W^m$, fulfill condition (I), second, that they fulfill
condition (II), and third, that they alone of all S, $S \in W$, fulfill
both conditions.

First, let there be two sets $S_p$ and $S_q$, where $p < m$, $q = m$,
and $S_p$ is a proper subset of $S_q$. Here it is always true, by reason
of (14), (17), (18), and (20), that $v(S_p) < v(S_q)$, for $v(S_p)$ is a
negative number or zero while, when $q = m$, $v(S_q)$ is a positive one.
Hence the amount $[v(S_q) - v(S_p)]$ can always be divided
among the i, $i \in S_p$ and $S_q$, and the j, $j \in S_q$, $j \notin S_p$, in such a way
as to guarantee that $a_j^{S_q} > 0$ and $a_i^{S_p} < a_i^{S_q}$. (That is, $S_p$, by turning
itself into a minimal winning coalition, $S_q$, can increase its
value sufficiently to pay all its old members more than they
receive in $S_p$ and to pay its new members something for
joining.) Hence S, $S \in W^m$, satisfies condition (I) of being realizable.

Second, let there be two sets $S_q$ and $S_r$, where $m = q < r$, and
where $S_q$ is a proper subset of $S_r$. Here it is possible that
$v(S_q) \gtreqless v(S_r)$. So it is necessary to prove that S, $S \in W^m$, meets
condition (II) in each of three cases:

Case 1. $v(S_q) > v(S_r)$. Since $v(S_q) > v(S_r)$, it is always
possible to form $S_q$ in such a way that, for $i \in S_q$ and $S_r$,

$a_i^{S_q} = a_i^{S_r} + d_i \sum_{h \notin S_q} a_h^{S_r} + d_i [v(S_q) - v(S_r)]$

where $\sum_i d_i = 1$ and $d_i > 0$. (That is, by reducing from $S_r$, the
members of $S_q$ can keep what they get in $S_r$, divide up the
payments made in $S_r$ to the people ejected when $S_q$ was formed,
and divide up the increase in value.) Thus it is possible that
$a_i^{S_q} > a_i^{S_r}$.
Case 2. $v(S_q) = v(S_r)$. As in Case 1, it is possible to form $S_q$ so
that, for $i \in S_q$ and $S_r$,

$a_i^{S_q} = a_i^{S_r} + d_i \sum_{h \notin S_q} a_h^{S_r}$

Hence, as in Case 1, it is possible that $a_i^{S_q} > a_i^{S_r}$.

Case 3. $v(S_q) < v(S_r)$. In this case, there are three subcases
according to the size of the sum of the payments to h, $h \in S_r$,
$h \notin S_q$. It may be that $\sum_{h \notin S_q} a_h^{S_r} \gtreqless v(S_r) - v(S_q)$.

Case 3.1. $\sum_{h \notin S_q} a_h^{S_r} > v(S_r) - v(S_q)$. Since
$[v(S_q) + \sum_{h \notin S_q} a_h^{S_r}] > v(S_r)$,
it is possible to form $S_q$ in such a way that $a_i^{S_q} > a_i^{S_r}$, $i \in S_q$ and $S_r$.
For example, if $a_i^{S_r} = d_i (v(S_r))$, then let
$a_i^{S_q} = d_i [v(S_q) + \sum_{h \notin S_q} a_h^{S_r}]$.

Case 3.2. $\sum_{h \notin S_q} a_h^{S_r} = v(S_r) - v(S_q)$. By the condition of this case,
that $[v(S_q) + \sum_{h \notin S_q} a_h^{S_r}] = v(S_r)$, it is possible that $a_i$ be chosen so
that at least $a_i^{S_q} = a_i^{S_r}$.

Case 3.3. $\sum_{h \notin S_q} a_h^{S_r} < v(S_r) - v(S_q)$. Let
$b = v(S_r) - v(S_q) - \sum_{h \notin S_q} a_h^{S_r}$.
Then b is the amount that all i, $i \in S_q$, might gain from enlarging
the coalition $S_q$ to $S_r$. Let $c = v(S_r) - v(S_q)$. Then c is the amount
that $-S_r$ can afford to offer $h \in S_r$, $h \notin S_q$, to form $-S_q$. If
$\sum_{h \notin S_q} a_h^{S_r} > 0$, then $c > b$. If $\sum_{h \notin S_q} a_h^{S_r} = 0$, then $c = b$. We can expect
from assertion (7) that $-S_r$ will bid for h up to the amount c,
which will then bring about the situation in which $a_i^{S_r} - a_i^{S_q} = 0$.
(Only a kind of altruistic good will on the part of $-S_r$ can
prevent this result; and such good will is not to be expected in
a zero-sum game.) Hence $a_i^{S_q} = a_i^{S_r}$. In this case it is argued that,
if $S_r$ is formed, it is not realizable because $-S_r$ can by
appropriate bidding force the formation of $S_q$, $S_q \in W^m$. Then there
are the following discernible entities:

(1) $S_q$; (2) those members of $-S_q$ for which $a_i$ is a negative
number; and (3) those members of $-S_q$ for which $a_i$ is a positive
number or zero.

In sub-cases 3.2 and 3.3, $a_i^{S_q} = a_i^{S_r}$, and in sub-case 3.1 and
Cases 1 and 2, $a_i^{S_q} > a_i^{S_r}$. Hence condition (II), that $a_i^{S_q} \ge a_i^{S_r}$, is
fulfilled by S, $S \in W^m$.
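The division described in Case 1 can be illustrated numerically. The sketch below assumes that the payments to the members of the larger coalition sum to its value and that the shares d_i are positive and sum to one; the function name `case1_payments` and the sample figures are hypothetical.

```python
from fractions import Fraction


def case1_payments(pay_r, members_q, v_q, v_r, shares):
    """Payments in S_q under the Case 1 division (v(S_q) > v(S_r)).

    pay_r:     dict player -> payment as a member of the larger coalition S_r
    members_q: players retained in the smaller coalition S_q
    shares:    dict player -> d_i, positive and summing to one over S_q
    """
    if v_q <= v_r:
        raise ValueError("Case 1 requires v(S_q) > v(S_r)")
    if sum(shares[i] for i in members_q) != 1 or any(shares[i] <= 0 for i in members_q):
        raise ValueError("shares must be positive and sum to one")
    # Payments formerly made to the players ejected when S_q was formed.
    ejected_total = sum(pay for h, pay in pay_r.items() if h not in members_q)
    gain = v_q - v_r
    return {i: pay_r[i] + shares[i] * ejected_total + shares[i] * gain
            for i in members_q}


# Illustrative numbers: v(S_q) = 2 for three players, v(S_r) = 1 for four.
pay_r = {name: Fraction(1, 4) for name in "ABCD"}
shares = {name: Fraction(1, 3) for name in "ABC"}
pay_q = case1_payments(pay_r, ["A", "B", "C"], 2, 1, shares)
print(pay_q)
```

In this example each retained member moves from 1/4 to 2/3, and the three payments together exhaust the value 2 of the smaller coalition.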

Third, it remains to show that only S, $S \in W^m$, fulfill both
conditions. Since in Cases 1, 2, and 3.1 of the proof that S,
$S \in W^m$, fulfilled condition (II), it was shown that $S_q$ could satisfy
the inequation $a_i^{S_q} > a_i^{S_r}$, and since it had already been
established that $a_i^{S_q} > a_i^{S_p}$, it follows that in these cases, only S, $S \in W^m$,
satisfies both conditions. But in Cases 3.2 and 3.3 it was shown
only that $a_i^{S_q} = a_i^{S_r}$. Since it is thus possible that, for some choice
of r, $r \ne q$, $a_i^{S_r} \ge a_i^{S_{r+1}}$, both $S_r$ and $S_q$ are possible candidates
for fulfilling condition (II). But since $a_i^{S_q} = a_i^{S_r}$, where $q < r$,
clearly $S_r$ does not fulfill condition (I), which requires that
$a_i^{S_r} > a_i^{S_q}$. But when $q = m$, $S_q$ does fulfill condition (I), even in
Cases 3.2 and 3.3. Hence only S, $S \in W^m$, fulfills both conditions
simultaneously. Thus only minimal winning coalitions are
realizable. That is, $W^q = W^m$.


IV 

The size principle does not by any means solve all the
problems connected with n-person games. Since in a simple majority
game where players are weighted equally, there are 1, 2, ..., t
possible coalitions in $W^q$, where $t = \binom{n}{m}$, it is apparent that the
size principle does not narrow the selection down to a unique
coalition. (As I have shown in (1), pp. 127-39, however, in some
simple majority games where players are weighted unequally,
$W^m$ is a single member set.) Furthermore, the size principle
tells us very little about imputations, except that given some
payoff to i in $S_p$ or $S_r$, the payoff to i in $S_q$ is equal to or better
than the payoff in $S_p$ or $S_r$. Finally, since the narrowing that
permitted the size principle eliminated games in which
particular players are specially favored or disfavored, it tells
us nothing about games in which discrimination is permitted.
In short, there are many non-unique features of predictions in
n-person theory. Nevertheless, as I tried to demonstrate in (1),
the narrowing accomplished here permits some new and
revealing interpretations of politics.
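The non-uniqueness noted above is easy to see by enumeration. The sketch below lists every coalition of exactly m players out of n equally weighted players and compares the count with the binomial coefficient; the function name is hypothetical, and the unequally weighted case is not covered.

```python
from itertools import combinations
from math import comb


def minimal_winning_coalitions(players, m):
    """All coalitions of exactly m players, for equally weighted players."""
    return [frozenset(c) for c in combinations(players, m)]


players = ["A", "B", "C", "D", "E"]
coalitions = minimal_winning_coalitions(players, 3)
print(len(coalitions), comb(len(players), 3))
```

With five players and m = 3 there are ten candidate coalitions, so the size principle fixes the size of the expected coalition but not its membership.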
A Mathematical Model of Policy
Formation in a Democratic Society¹

OTTO A. DAVIS and MELVIN HINICH 

Carnegie Institute of Technology 

1. Introduction 

It is obvious that there are many factors which influence the
policies adopted by a democratic government. Close observers of
the political scene easily can cite instances where the very
complexity of the governmental organization allows one part of that
entity to have policies which serve to frustrate the policies of
another part. It is equally clear that instances exist in our complex
system where some policies of some parts of our government are
unknown to our elected leaders (not to mention the people). There
is no doubt but that any truly general and complete theory of
policy formation should explain such anomalies. Nevertheless, they
are ignored in the developments which follow. Instead of these
anomalies, attention is centered on an idealized situation where
full knowledge of governmental policy is available to all.

It is also evident that in a democracy where a government
enjoys power because it won an election, that government's
policies must bear some relationship to the desires of the voters.
The determination of this relationship is the problem with which
this paper is concerned. Nevertheless, it should be admitted at the
outset that the very concept of the “desires of the voters” is
somewhat ambiguous. Although it cannot be denied that some
members of the population (and perhaps all of the relevant
population for some subset of issues) have clearly defined positions on
policy, evidence reported by various pollsters would seem to
indicate that, at least for some issues, the very debate connected with
an election may have an influence upon public opinion. Partly
because such an influence does not seem to be fully understood, this
phenomenon also is omitted from the model developed here.
Perhaps the sole justification of this and the above omissions is that
one must learn to walk before one is able to run. Yet, these
omissions mean that this paper should be viewed as an effort to study
only one idealized aspect of the real situation.

¹ This research was supported by a grant from Resources for the Future to the
Graduate School of Industrial Administration, Carnegie Institute of Technology. While
only the authors are responsible for the contents of this paper, the comments of Morris
DeGroot, Carnegie Institute of Technology, and Aaron Wildavsky, University of
California at Berkeley, are gratefully acknowledged.

The particular (and main) problem investigated here is as
follows: Given the precisely defined (see the developments below)
and unchangeable preferences of the voters in the population,
candidates for public office compete for votes by announcing
before an election their exact position on each of the relevant issues.
Each voter compares the positions taken by the various candidates
and casts his vote for that particular candidate whose position is
“nearest” (a more careful definition is given below) his own most
preferred position. It is assumed that, once elected, a (former)
candidate will adopt those policies which he announced during
the campaign. Thus the questions to be answered are whether,
and under what conditions, dominant strategies exist for the
candidates.

Other problems also are analyzed within this context. For
example, the policy choice of a beneficent dictator is compared with
the dominant strategy for two candidates in a democratic system.
The dilemma inherent in the process of nominating a candidate is 
discussed. Finally, a basic assumption is relaxed to allow for the 
possibility that one portion of the population may not care about 
some particular subset of issues while the other portion feels 
strongly about these very issues. 

2. Basic Assumptions and Tools of Analysis 

In order to be able to handle these basic problems, it is neces- 
sary to make some simplifying assumptions. First, it must be pre- 
sumed that, at least conceptually, policies can be measured by 
certain indices. Consider, for example, the issue of civil rights. One
might use several indices to measure the various characteristics 
of this issue. Voting rights might be measured by the percentage 
of the adult, nonwhite population which can be registered to vote. 
Integration in the schools and in housing might be measured by 
the variance of the percentages of nonwhites attending the various 
schools and living in the various localities respectively. Job dis- 
crimination might be measured by the percentage of nonwhites 
employed in various categories of work. On the other hand, one 
might use an index of these various indices. The crucial point is 
that some type of measurement be admitted. 

Granted that policies can be measured by the postulated indices,
another (even stronger) assumption is now appropriate. It is that
each voter in the population uses the same indices to measure any
given policy. In other words, the indices measuring the various
policies are common to all voters. It is apparent that this
assumption means that since the number of variables which measure any
policy issue is somewhat arbitrary, all voters in the population are
assumed to have the same given degree of sophistication in the
manner in which they view policies.

It is assumed further that each voter has a preferred position for
each issue of policy. This preferred position can be represented by
certain values of the variables which measure each policy.
Consequently, the i-th voter's preferred position on all the issues of
policy can be represented by the vector

$x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]'$

where the components of the vector $x_i$ represent the desired values
of the indices which measure the given policies.² Thus $x_{i1}$ might
represent the percentage of the adult, nonwhite population which
can be registered to vote; $x_{i2}$ and $x_{i3}$ might measure respectively
the variance of the percentages of nonwhites attending the various
schools and living in the various localities; etc.

In a manner similar to which the preferred position (or point)
of an individual voter is represented, the vector

$\theta_j = [\theta_{j1}, \theta_{j2}, \ldots, \theta_{jn}]'$

can be taken to represent the position (or “platform”) of the j-th
candidate. The column vector is presumed to be announced
before any election and is known to all voters.

Since only in a degenerate case could $x_i = \theta_j$ for all i, some
provision must be made for the “loss” which any voter feels when his
preferred policy position is not the one selected for enactment.
Such provision can be accomplished by the introduction of
individual loss functions. Obviously, loss functions should exhibit
certain intuitively desirable properties. Let $\theta$ represent
governmental policy. Then $\theta$ is a vector composed of the indices
discussed above. For the moment, view the components of $\theta$ as
variables. Consider the i-th voter. Obviously, if $x_i = \theta$, then this
individual's loss should be zero since governmental policy is the same
as his preferred position on all issues. However, consonant with
the notion that each individual does have a preferred position for
each issue of policy, if $x_i \ne \theta$ the i-th voter should have a
positive loss. These properties are present in the following
specification of the i-th voter's loss function:

(2.1) $L_i(\theta) = (x_i - \theta)' A (x_i - \theta)$

where $L_i$ represents the loss function and A is a symmetric,
positive definite matrix of rank n.³

² The prime (') on the explicit vector on the right hand side of the equality denotes
the operation “transpose.” Thus $x_i$ is a column vector.
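A small numerical sketch may make the loss function (2.1) concrete. It assumes, for illustration only, that “nearest” is read as “smallest loss”; the matrix, positions, and function names below are hypothetical and not taken from the paper.

```python
def quadratic_loss(x, theta, A):
    """Loss (x - theta)' A (x - theta) for preferred point x and policy theta.

    x and theta are lists of equal length n; A is an n-by-n list of lists,
    assumed symmetric and positive definite.
    """
    d = [xi - ti for xi, ti in zip(x, theta)]
    n = len(d)
    return sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))


def preferred_candidate(x, platforms, A):
    """Index of the platform with the smallest loss for the voter at x."""
    losses = [quadratic_loss(x, theta, A) for theta in platforms]
    return min(range(len(platforms)), key=losses.__getitem__)


A = [[2, 0], [0, 1]]          # illustrative weights on two policy indices
voter = [1, 1]                # the voter's preferred position
platforms = [[0, 3], [2, 1]]  # two announced platforms
print(quadratic_loss(voter, voter, A))        # zero loss at the preferred point
print(quadratic_loss(voter, platforms[0], A))
print(preferred_candidate(voter, platforms, A))
```

With these numbers the first platform costs the voter 6 and the second 2, so the second platform is chosen; the heavier weight on the first index is what makes a one-unit miss there count twice as much as on the second.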

Observe that (2.1) is a quadratic form. The choice of this
specific form requires further justification, since other
functions possess the two properties discussed above. First, a
quadratic form is the simplest of the class of functions having
these properties and it is preferable, other things being equal, on
this basis. Second, a loss function has an obvious relationship to
the economist’s notion of a utility function and, in fact, a quadratic
loss function can be derived from a quadratic utility function. The
basic notion underlying utility analysis is that of declining
marginal utility. A quadratic utility function incorporates this concept.
It follows that a quadratic loss function is acceptable on this basis.
Third, it can be argued that no matter what the “true” loss
function is (at least if it incorporates the properties specified in the
above paragraph), a quadratic can serve as an acceptable
approximation. This argument can be based upon expanding the function
in a Taylor’s series, noting that the first order terms are zero if the
loss is symmetric, and throwing away the third and higher order
terms. Finally, the authors argue that the proof of the pudding is
in the eating and that intuitively interesting and informative
results can be derived on the basis of quadratic losses.

For the special case of $n = 1$ (one issue with a single index) the
loss function (2.1) is plotted in Figure 1. Note that it is symmetric
around the point of zero loss ($x_i = \theta$).
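
The two properties required of the loss function can be checked numerically. The following is a minimal sketch, not part of the original analysis; the matrix `A`, the preferred point `x_i`, and the function name `quadratic_loss` are hypothetical choices made only for illustration.

```python
def quadratic_loss(x, theta, A):
    """Loss (x - theta)' A (x - theta) for preferred point x and policy theta."""
    d = [xi - ti for xi, ti in zip(x, theta)]
    n = len(d)
    return sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))

# Hypothetical symmetric, positive definite weighting matrix for two issues.
A = [[2.0, 0.5],
     [0.5, 1.0]]
x_i = [1.0, 3.0]  # hypothetical preferred position of one voter

loss_at_preferred = quadratic_loss(x_i, x_i, A)     # policy equals preferred point
loss_below = quadratic_loss(x_i, [0.0, 3.0], A)     # policy one unit below on issue 1
loss_above = quadratic_loss(x_i, [2.0, 3.0], A)     # policy one unit above on issue 1

print(loss_at_preferred, loss_below, loss_above)
```

The loss is zero at the preferred point and takes the same positive value for equal departures in opposite directions, which is the symmetry noted for Figure 1.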

It should be pointed out that the matrix $A$ in (2.1) is not given
a subscript. The reason for this omission is a rather strong
assumption. Although the components of the vector $x_i$ can assume any
values which the $i$-th individual might desire, it is presumed that
the tastes of the voters are such that the same matrix $A$ enters the
loss function of each individual. The population of voters is assumed


³ By definition, an $n \times n$ matrix $A$ is symmetric if it is equal to its transpose; that
is, if $A = A'$. The assumption that $A$ is positive definite is a sufficient condition for
the property $L_i(\theta) \geq 0$ for all $\theta$ and $L_i(\theta) = 0$ if and only if $x_i = \theta$. A
necessary and sufficient condition for $A$ to be positive definite is that the naturally
ordered principal minors of $A$ are all positive. See, e.g., G. Hadley, Linear Algebra
(Reading: Addison-Wesley, 1961), pp. 251-63.
to have a certain “homogeneity.” In other words, although voters
desire differing values of the indices of policy, all voters assign the
same relative “weight” (or “importance”) to any given issue.⁴ This
(admittedly unrealistic) presumption is made solely for the reason
of analytical convenience. It should be noted, on the other hand,
that since utility losses are highly personal matters with no
inter-individual scale of measurement which has meaning, care must
be exercised in attaching any significance to a comparison of the
numerical values of the losses of any two given individuals. However,
the very notion that utility functions are unique only up to a
monotonic transformation provides a (somewhat weak) rationalization
for the assumption that the same matrix $A$ enters the loss
function of each individual. At least for a class of loss functions,
suitable transformations could be performed on these functions so that
this assumption could be satisfied.

Finally, there is the problem of characterizing the population of
voters. Granted the previous assumptions, this can be accomplished
by presuming that the preferred positions of all voters have been
plotted into an $n$ dimensional frequency and that this frequency
has been suitably normalized into a density $f(x)$. While this
density is naturally discrete, for the most part it will be approximated
by a continuous density. It should be noted that this method of
characterizing the population gives one access to the tools of
probability theory for the purpose of analysis.

As a matter of notation, it is presumed that 

(2.2) $E x = \delta$

where $E$ represents the operation of taking expected values so that
$\delta$ is the vector whose components are the means of the components
of the $x_i$. Also,

(2.3) $E(x - \delta)(x - \delta)' = \Sigma$

where $\Sigma$ is an $n \times n$ matrix whose diagonal elements are variances
and non-diagonal elements are covariances.

It is convenient to introduce the notion of the norm. Let $z$
represent an $n$ component vector. The norm of the vector $z$ with
respect to the matrix $A$ is defined as follows:

(2.4) $\|z\| = \sqrt{z'Az}$

The norm represents the “length” of the vector $z$. Since $A$ is
positive definite, $z \neq 0$ implies $\|z\| > 0$. The norm also
represents the “distance” between two vectors. For example, the norm

(2.5) $\|z_1 - z_2\| = \sqrt{(z_1 - z_2)' A (z_1 - z_2)}$

represents the “distance” between vectors $z_1$ and $z_2$ with respect to
the matrix $A$. In the developments below, a norm with respect to
the matrix $A \Sigma A$ is used also. This norm is distinguished by the
notation

(2.6) $\|z\|^* = \sqrt{z' A \Sigma A z}$

⁴ See, however, the discussion of Section 6 where this assumption is modified slightly.

3. Two Candidates and a Beneficent Dictatorship 

It is convenient and appropriate, before analyzing the basic case 
of electoral competition between two candidates, to consider the
policies which a wise and beneficent dictator might choose. In 
this way, the dictator's choice can be compared with the policies 
resulting from the competition of the democratic process. 

First of all, it is clear that the dictator must make some
assignment of the weights of the importance (to the dictator) of the
losses suffered by the various individuals in the population.
Suppose that the dictator decides to weigh all individuals equally and
makes the value judgment that utility losses are interpersonally
comparable. Thus he decides that his policies should be chosen so
that the average loss is minimized. In other words, the dictator
desires to choose a vector $\theta$ such that the expression

(3.1) $E(x - \theta)' A (x - \theta) = E\|x - \theta\|^2 = \operatorname{tr} A\Sigma + \|\theta - \delta\|^2$

is minimized.⁵ It is clear that this expression is at a minimum when
$\theta$ is chosen such that $\theta = \delta$. In other words, granted the dictator’s
value judgment, and also granted his desire to minimize that
quantity which he perceives to be the total of the utility losses in
the population, he must choose his policies to be the average of
the preferred positions of the individuals in the population.
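
The decomposition of the average loss into a trace term plus the squared distance of the policy from the mean can be verified on a small finite population. The sketch below is illustrative only; the four preferred points, the matrix `A`, and the helper names are hypothetical, and the covariance is computed with the population (not sample) divisor so that the identity holds exactly.

```python
def quad(d, A):
    """Quadratic form d' A d."""
    n = len(d)
    return sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n))

def mean_vector(points):
    m = len(points)
    return [sum(p[k] for p in points) / m for k in range(len(points[0]))]

def covariance(points, delta):
    """Population variance-covariance matrix around the mean vector delta."""
    m, n = len(points), len(delta)
    return [[sum((p[r] - delta[r]) * (p[c] - delta[c]) for p in points) / m
             for c in range(n)] for r in range(n)]

def average_loss(points, theta, A):
    n = len(theta)
    return sum(quad([p[k] - theta[k] for k in range(n)], A) for p in points) / len(points)

A = [[2.0, 0.5], [0.5, 1.0]]                               # hypothetical weights
voters = [[0.0, 0.0], [1.0, 2.0], [4.0, 1.0], [3.0, 5.0]]  # hypothetical preferred points
delta = mean_vector(voters)
sigma = covariance(voters, delta)
trace_A_sigma = sum(A[r][c] * sigma[c][r] for r in range(2) for c in range(2))

theta = [1.0, 1.0]                                         # an arbitrary policy
lhs = average_loss(voters, theta, A)
rhs = trace_A_sigma + quad([theta[k] - delta[k] for k in range(2)], A)
loss_at_mean = average_loss(voters, delta, A)

print(lhs, rhs, loss_at_mean)
```

Since the trace term does not depend on the policy, the average loss at the mean vector equals that term and is the smallest attainable value.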

⁵ In this notation, $(x - \theta)' A (x - \theta) = \|x - \theta\|^2$ by definition. In other words,
the quantities are the same except for notation. Thus $\|\theta - \delta\|^2 = (\theta - \delta)' A (\theta - \delta)$.
The trace of a matrix (denoted “tr” above) is defined as follows: Let $B$ represent
an $n \times n$ matrix whose elements are denoted $b_{ij}$. Then

$\operatorname{tr} B = \sum_{i=1}^{n} b_{ii}$.

In other words, the trace of $B$ is the sum of the diagonal elements.

Turning now to the case of two-candidate competition in a
democratic society, it is convenient to begin by stating that the
candidates will be called “one” and “two” respectively so that $\theta_1$
denotes the platform of the 1st candidate and $\theta_2$ represents the
platform of the 2nd candidate. These platforms are announced before
the election day and form the basis for the voters’ choice
between the two candidates. (Recall the convenient assumption
that the elected candidate will honor his platform.) Essentially, a
voter is assumed to choose that candidate whose platform gives
the smallest utility loss. In other words, the $i$-th voter will cast his
ballot for the 1st candidate if

(3.2) $(x_i - \theta_1)' A (x_i - \theta_1) < (x_i - \theta_2)' A (x_i - \theta_2)$

and it obviously follows that if

(3.3) $(x_i - \theta_1)' A (x_i - \theta_1) > (x_i - \theta_2)' A (x_i - \theta_2)$

the 2nd candidate will receive the $i$-th individual’s vote. In the
unlikely event that the utility losses are the same, it can be presumed
that the voter makes his choice by flipping a fair coin.

Having developed the rules for a voter’s choice of candidate, it
is appropriate to consider the relationship between this analysis
and the works of Hotelling,⁶ Downs,⁷ and Tullock.⁸ The unifying
elements in the relevant parts of these works are two presumptions.
First, there is only one index of policy. Second, distance can
be used to determine how a voter will cast his ballot. Thus, in the
terminology of this analysis, let $n = 1$. Then a representative loss
function is presented in Figure 1. Given the previous assumptions,
this function must be symmetric. It follows that (3.2) obtains if
and only if $|x_i - \theta_1| < |x_i - \theta_2|$ and (3.3) obtains if and only
if $|x_i - \theta_1| > |x_i - \theta_2|$. In other words, a voter chooses that
candidate whose platform is nearest to his own (the voter’s)
preferred position.⁹

Consider the number $\theta^*$ which satisfies the following conditions:

(3.4) $P(x < \theta^*) \leq 1/2$
      $P(x \leq \theta^*) \geq 1/2$

where $P$ represents “probability.” In other words, $\theta^*$ is the (not
necessarily unique) median of $f(x)$.

Consider now the problem of the choice of platforms. Suppose
that the 1st candidate selects the platform $\theta_1 = \theta^*$ and the 2nd
candidate selects some platform $\theta_2 \neq \theta^*$ where $\theta^*$ represents any

⁶ Harold Hotelling, “Stability in Competition,” Economic Journal, XXXIX (1929),
41-57; reprinted in G. J. Stigler and K. E. Boulding (eds.), Readings in Price Theory
(Chicago: Richard D. Irwin, 1952), pp. 467-484.

⁷ Anthony Downs, An Economic Theory of Democracy (New York: Harper, 1957).

⁸ Gordon Tullock, The Politics of Bureaucracy (Washington: Public Affairs Press,
1965).

⁹ The notation $|\cdot|$ denotes “absolute value.” In a single dimension, this is a measure
of distance.
number which satisfies (3.4). Put another way, the 1st candidate
chooses a median strategy while the 2nd candidate selects a
non-median platform. Given these choices, it is clear that (under the
assumptions) the 1st candidate will win the election. In other
words, the median is a dominant strategy. A choice of the median
insures a candidate of at least an even chance of winning.

In order to justify this theorem, it is sufficient to observe that
the very definition of the median (3.4) insures the 1st candidate
of having a platform nearer to the preferred positions of at least
one-half of the voters than the platform of the 2nd candidate. This
fact is also obvious in Figure 2 where a density $f(x)$ is drawn,
$\theta^*$ represents the median, and $\theta_2 \neq \theta^*$ is an arbitrary choice of the
other candidate.

Given the presumed voting rules (3.2 and 3.3), it is clear that
the best that the 2nd candidate can do is also to select a median
strategy $\theta_2 = \theta^*$. In this event both candidates have an even chance
of winning.

The dominance of the strategy of playing the median means that
insofar as candidates are interested in winning the election, they
should try to achieve this “middle position.” Non-median strategies
are to be avoided, for they only invite defeat. (At least under the
assumptions made here, which implicitly include the presumption
that all qualified individuals vote.)

Contrast this result with the presumed choice of a beneficent
dictator. When the density $f(x)$ of preferred points is such that
the mean and median coincide, then the dominant strategy for a
candidate is the same as the beneficent dictator’s choice. However,
if the density $f(x)$ is skewed so that the mean and median
are not the same, then the choices differ.
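
A small skewed population makes the contrast concrete. The sketch below is illustrative only; the five preferred points and the function name `share_for_first` are hypothetical, and ties are counted as half a vote to stand for the fair coin of the voting rule.

```python
import statistics

def share_for_first(voters, theta1, theta2):
    """Fraction of voters nearer to theta1 than to theta2 (ties count one half)."""
    first = sum(1.0 for x in voters if abs(x - theta1) < abs(x - theta2))
    ties = sum(0.5 for x in voters if abs(x - theta1) == abs(x - theta2))
    return (first + ties) / len(voters)

voters = [1, 2, 3, 4, 20]            # hypothetical skewed preferred points, n = 1
median = statistics.median(voters)   # the dominant electoral strategy
mean = statistics.mean(voters)       # the beneficent dictator's choice

share_median_vs_mean = share_for_first(voters, median, mean)
print(median, mean, share_median_vs_mean)
```

In this example a candidate at the median defeats a candidate at the mean by four votes to one, so the electoral outcome and the dictator's choice part company when the density is skewed.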

The question arises as to whether this result can be extended. It 
is particularly interesting to inquire as to whether anything can 
be said when the number of components in the vector x is greater 
than one. In this regard, let n > 1 be an arbitrary integer. This 
means that f(x) is a multivariate density. It is necessary to per- 
form a certain amount of algebraic manipulation to get the voting 
rules into a form which is useful for analysis. 

Consider the instance in which the $i$-th individual votes for the
1st candidate so that (3.2) is presumed to obtain. Dropping the $i$
subscripts for convenience, it is easily seen that (3.2) can be
expanded into the following equivalent statement

(3.5) $x'Ax - 2x'A\theta_1 + \theta_1'A\theta_1 < x'Ax - 2x'A\theta_2 + \theta_2'A\theta_2$

since $x'A\theta_1 = \theta_1'Ax$ and $x'A\theta_2 = \theta_2'Ax$. By taking $\theta_2'A\theta_2$ to the left
hand side and $2x'A\theta_1$ to the right hand side, the expression

(3.6) $\theta_1'A\theta_1 - \theta_2'A\theta_2 < 2x'A(\theta_1 - \theta_2)$

is obtained. This can be written as

(3.7) $(\theta_1 + \theta_2)'A(\theta_1 - \theta_2) < 2x'A(\theta_1 - \theta_2)$

and by subtracting $2\delta'A(\theta_1 - \theta_2)$ from both sides, the expression

(3.8) $(\theta_1 + \theta_2)'A(\theta_1 - \theta_2) - 2\delta'A(\theta_1 - \theta_2) < 2(x - \delta)'A(\theta_1 - \theta_2)$

is obtained. Examine the left hand side of this expression (3.8).
The following is simply an algebraic manipulation.

(3.9) $(\theta_1 + \theta_2)'A(\theta_1 - \theta_2) - 2\delta'A(\theta_1 - \theta_2) = (\theta_1 + \theta_2 - 2\delta)'A(\theta_1 - \theta_2) = [(\theta_1 - \delta) + (\theta_2 - \delta)]'A(\theta_1 - \theta_2)$

Obviously, simultaneously adding and subtracting $\delta$ does not alter
the value of this expression. Thus one can write (3.9) in the form

(3.10) $[(\theta_1 - \delta) + (\theta_2 - \delta)]'A[(\theta_1 - \delta) - (\theta_2 - \delta)] = (\theta_1 - \delta)'A(\theta_1 - \delta) - (\theta_2 - \delta)'A(\theta_2 - \delta) = \|\theta_1 - \delta\|^2 - \|\theta_2 - \delta\|^2$

and the last part of this step is nothing more than the notation
introduced in (2.5). It is easily observed from (3.10) and (3.8)
that if (3.2) obtains, then

(3.11) $2(x - \delta)'A(\theta_1 - \theta_2) > \|\theta_1 - \delta\|^2 - \|\theta_2 - \delta\|^2$

also holds. In other words, the $i$-th individual votes for the 1st
candidate if (3.11) is true.
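
The chain of manipulations leading from (3.2) to (3.11) can be checked numerically by comparing the margins of the two inequalities on randomly drawn vectors. This is only a sketch; the matrix `A`, the reference vector `delta`, the sampling range, and the helper names are hypothetical. Note that the algebra does not require `delta` to be the mean, so any fixed vector serves for the check.

```python
import random

A = [[2.0, 0.5],
     [0.5, 1.0]]          # hypothetical symmetric positive definite matrix
delta = [0.3, -1.2]       # hypothetical reference (mean) vector

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def bilinear(u, v):
    """u' A v for the fixed matrix A."""
    return sum(u[r] * A[r][c] * v[c] for r in range(2) for c in range(2))

def sq_norm(u):
    return bilinear(u, u)

rng = random.Random(0)    # fixed seed so the run is repeatable
max_gap = 0.0
for _ in range(1000):
    x = [rng.uniform(-5, 5) for _ in range(2)]
    theta1 = [rng.uniform(-5, 5) for _ in range(2)]
    theta2 = [rng.uniform(-5, 5) for _ in range(2)]
    # Margin of rule (3.2): loss under platform 2 minus loss under platform 1.
    margin_32 = sq_norm(sub(x, theta2)) - sq_norm(sub(x, theta1))
    # Margin of rule (3.11): left hand side minus right hand side.
    margin_311 = (2 * bilinear(sub(x, delta), sub(theta1, theta2))
                  - (sq_norm(sub(theta1, delta)) - sq_norm(sub(theta2, delta))))
    max_gap = max(max_gap, abs(margin_32 - margin_311))

print(max_gap)
```

The two margins agree up to rounding error, so one inequality holds exactly when the other does.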

For the moment, consider $x$ to be a vector selected at random
from $f(x)$. Then it is useful to know the mean and variance of
one half the quantity on the left hand side of (3.11).

(3.12) $E[(x - \delta)'A(\theta_1 - \theta_2)] = 0$
       $\operatorname{Var}[(x - \delta)'A(\theta_1 - \theta_2)] = (\theta_1 - \theta_2)'A\Sigma A(\theta_1 - \theta_2)$

The following definition (see (2.6)) is simply a matter of notation.

(3.13) $\sqrt{(\theta_1 - \theta_2)'A\Sigma A(\theta_1 - \theta_2)} = \|\theta_1 - \theta_2\|^*$

In other words, $\|\theta_1 - \theta_2\|^*$ is simply the standard deviation of
$(x - \delta)'A(\theta_1 - \theta_2)$ when $x$ is considered to be a vector selected
at random from $f(x)$.

Consider the following definitions.

(3.14) $y = \dfrac{(x - \delta)'A(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|^*}$
       $t = \dfrac{\|\theta_1 - \delta\|^2 - \|\theta_2 - \delta\|^2}{2\,\|\theta_1 - \theta_2\|^*}$

Then the statement

(3.15) $y > t$

is equivalent to statement (3.11). In other words, those individuals
for whom (3.15) is true will vote for the 1st candidate.

Expression (3.15) is useful for analysis. It is desired to
investigate the possibility of the 1st candidate being able to select his
platform (policies) $\theta_1$ in such a manner that he is certain to win the
election if $\theta_1 \neq \theta_2$. (Note that if $\theta_1 = \theta_2$, then neither (3.2) nor
(3.3) obtains and the election is equivalent to tossing a coin.)
Consider selecting a voter at random from the population $f(x)$. If

(3.16) $P[(x - \theta_1)'A(x - \theta_1) < (x - \theta_2)'A(x - \theta_2)] > 1/2$

so that more than one half of the voters in the population obtain
a smaller utility loss from the 1st candidate’s platform than from
the one of his opponent, then the 1st candidate is certain to win
the election. The previous analysis shows that

(3.17) $P(y > t) > 1/2$

is equivalent to (3.16). Furthermore, if $f(x)$ is continuous so that
for $\theta_1 \neq \theta_2$

(3.18) $P[(x - \theta_1)'A(x - \theta_1) = (x - \theta_2)'A(x - \theta_2)] = 0$

then the 1st candidate wins if and only if (3.17) obtains.

It is now necessary to inquire into the conditions under which
(3.17) is true. Suppose that $f(x)$ is a multivariate normal density
with mean vector $\delta$ and variance-covariance matrix $\Sigma$. Then it is
clear from (3.12) and the definition (3.14) of $y$ that $y$ has a
standard normal distribution. Thus (3.17) is true if and only if
$t < 0$.

Examine the definition (3.14) of $t$. Suppose that the 1st candidate
selects $\theta_1 = \delta$. Obviously, $\|\delta - \delta\| = 0$. Then for any choice
of the 2nd candidate such that $\theta_2 \neq \delta$, $\|\theta_2 - \delta\| > 0$. It follows
that $t < 0$ so that (3.17) is true. In other words, if the 1st candidate
selects the policies in his platform to be exactly the same as
the mean of the policies desired by the individuals in the voting
population, and the other candidate does not make the same
choice, then the 1st candidate is certain to win the election.
Conversely, suppose that the 1st candidate selects $\theta_1 \neq \delta$. Obviously,
$\|\theta_1 - \delta\| > 0$ in such an instance. If the 2nd candidate selects
$\theta_2 = \delta$, then $\|\theta_2 - \delta\| = 0$ so that $t > 0$. Thus

(3.19) $P(y > t) < 1/2$

is true so that the 2nd candidate is certain to win the election.
Finally, if both $\theta_1 = \delta$ and $\theta_2 = \delta$, then it is obvious that a tie is
expected. The following theorem is established:


Theorem: 3.1: Given the assumptions resulting in voting rules (3.2) and
(3.3), then, if the density of preferred points $f(x)$ is normal, the
platform $\theta = \delta$ is a dominant strategy.

The fact that $\theta = \delta$ insures a candidate of winning the election if
the opposing candidate does not make the identical choice of
selecting his platform to be the vector of means of the preferred
positions, and gives the expectation of a tie if both candidates
choose the vector of means, indicates that there should be a
tendency for wise candidates to select such policies for their
platforms. It is interesting to note that insofar as this tendency is
observed, then the competition between candidates in a democratic
process tends to produce the policies which a beneficent dictator
operating under (3.1) would select.
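
Theorem 3.1 can be illustrated by simulation. The sketch below is not part of the original analysis. It assumes two issues, a hypothetical matrix `A`, preferred points drawn from a normal density with mean zero and identity variance-covariance matrix, and a hypothetical deviating platform for the 2nd candidate. The simulated vote share is compared with the value of $P(y > t)$ computed from (3.14) under the standard normal distribution of $y$.

```python
import math
import random
from statistics import NormalDist

A = [[2.0, 0.5], [0.5, 1.0]]   # hypothetical positive definite matrix
delta = [0.0, 0.0]             # mean vector of preferred points
theta1 = delta[:]              # 1st candidate adopts the mean
theta2 = [1.0, 0.5]            # 2nd candidate deviates (hypothetical platform)

def loss(x, theta):
    d = [x[k] - theta[k] for k in range(2)]
    return sum(d[r] * A[r][c] * d[c] for r in range(2) for c in range(2))

# Simulated electorate: independent standard normal components, so Sigma = I.
rng = random.Random(1)
voters = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20000)]
share_sim = sum(loss(x, theta1) < loss(x, theta2) for x in voters) / len(voters)

# Value implied by (3.14): with Sigma = I the starred norm is sqrt(d' A A d).
d = [theta1[k] - theta2[k] for k in range(2)]
Ad = [sum(A[r][c] * d[c] for c in range(2)) for r in range(2)]
star_norm = math.sqrt(sum(v * v for v in Ad))
t = (loss(theta1, delta) - loss(theta2, delta)) / (2 * star_norm)
share_exact = 1 - NormalDist().cdf(t)

print(round(share_sim, 3), round(share_exact, 3))
```

Under these assumptions $t$ is negative, so the candidate at the mean receives more than one half of the votes, and the simulated share should lie close to the value obtained from the normal distribution.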

The above result depends upon the assumed normality of $f(x)$.
Since the actual population of voters in any given country is
necessarily finite, this assumption means that the presumed normal $f(x)$
is an approximation to the actual density. Now for many cases this
approximation will be sufficiently good. Further, one can argue
that even if $f(x)$ is not assumed to be a normal density, $y$ can still
be approximated by a standard normal in many instances. Yet, one
may wonder whether it is possible to say anything when the
distribution of preferred points $f(x)$ is not known and no
approximations are allowed. The answer is affirmative, at least in the sense
that certain bounds can be derived. These bounds are stated in
terms of relative deviations from the vector $\delta$ of the means of
preferred points, and they indicate the powerful influences of the
means upon the policies produced by the democratic process.

Let $y$ and $t$ be defined by (3.14). By beginning with (3.3) and
performing steps (3.5)-(3.14), it is seen easily that

(3.20) $y < t$

is equivalent to (3.3). Therefore, those voters for whom (3.20)
obtains cast their ballots for the 2nd candidate. Without loss of
generality, consider the case in which

(3.21) $\|\theta_1 - \delta\| > \|\theta_2 - \delta\|$

so that $t > 0$. In other words, the 1st candidate’s platform is a
greater “distance” from the mean vector of preferred points than
is the platform of the 2nd candidate. Noting that $E(y) = 0$ and
$\operatorname{Var}(y) = 1$, it follows from Tchebyshev’s inequality¹⁰ that

(3.22) $P(y < t) \geq 1 - 1/t^2$

since the one-sided version of this inequality cannot have a smaller
probability of being true than the two-sided one. Further, from
the definition (3.14) of $t$, it is obvious that


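(3.23) $\dfrac{1}{t^2} = \dfrac{4\,(\|\theta_1 - \theta_2\|^*)^2}{(\|\theta_1 - \delta\|^2 - \|\theta_2 - \delta\|^2)^2}$

For the purpose of the argument, $\|\theta_1 - \theta_2\|^*$ must be replaced
by a more convenient quantity. Recall that utility is defined
uniquely only up to a monotonic transformation. Thus it can be
assumed, without loss of generality, that $\Sigma \leq A^{-1}$. If this were not
so, then $A$ could be multiplied by a positive scalar to make it so
without altering any of the analysis. It follows that the
presumption $A\Sigma A \leq A$ is legitimate for the purpose of analysis. Thus

(3.24) $\|\theta_1 - \theta_2\|^* \leq \|\theta_1 - \theta_2\|$

follows from this assumption, the definition (3.13) of $\|\theta_1 - \theta_2\|^*$,
and the definition (2.5) of the norm $\|\theta_1 - \theta_2\|$. Also,

(3.25) $\|\theta_1 - \theta_2\| \leq \|\theta_1 - \delta\| + \|\theta_2 - \delta\|$

by the triangle inequality.¹¹ Noting that

(3.26) $(\|\theta_1 - \delta\|^2 - \|\theta_2 - \delta\|^2)^2 = (\|\theta_1 - \delta\| + \|\theta_2 - \delta\|)^2 (\|\theta_1 - \delta\| - \|\theta_2 - \delta\|)^2$

one can use (3.24) and (3.25) to write

(3.27) $\dfrac{1}{t^2} \leq \dfrac{4\,(\|\theta_1 - \delta\| + \|\theta_2 - \delta\|)^2}{(\|\theta_1 - \delta\| + \|\theta_2 - \delta\|)^2 (\|\theta_1 - \delta\| - \|\theta_2 - \delta\|)^2}$

Cancelling the common term in the numerator and denominator,
one can use (3.27) to write (3.22) in the form

(3.28) $P(y < t) \geq 1 - \dfrac{4}{(\|\theta_1 - \delta\| - \|\theta_2 - \delta\|)^2}$

so that if

(3.29) $\|\theta_1 - \delta\| - \|\theta_2 - \delta\| > 2\sqrt{2}$

then

(3.30) $P(y < t) > 1/2$

so that the 2nd candidate receives more than one-half of the votes.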
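
The bound (3.28) is simple to evaluate for given platforms. The following sketch is illustrative only; the function names, the identity matrix used for `A`, and the platforms are hypothetical, and the bound is meaningful only under the normalization $\Sigma \leq A^{-1}$ assumed in the text.

```python
import math

def a_norm(d, A):
    """Norm (2.4): sqrt(d' A d)."""
    n = len(d)
    return math.sqrt(sum(d[r] * A[r][c] * d[c] for r in range(n) for c in range(n)))

def second_candidate_bound(theta1, theta2, delta, A):
    """Lower bound (3.28) on the vote share of the 2nd candidate.

    Requires the 1st platform to be farther from delta than the 2nd, as in (3.21).
    """
    gap = (a_norm([a - b for a, b in zip(theta1, delta)], A)
           - a_norm([a - b for a, b in zip(theta2, delta)], A))
    if gap <= 0:
        raise ValueError("condition (3.21) is not satisfied")
    return 1 - 4 / gap ** 2

A = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical: equal weights, no interaction
delta = [0.0, 0.0]

# Gap of 4 exceeds 2*sqrt(2), so the bound guarantees a majority.
bound_far = second_candidate_bound([5.0, 0.0], [1.0, 0.0], delta, A)
# Gap of 2 is below 2*sqrt(2); the bound is uninformative here.
bound_near = second_candidate_bound([3.0, 0.0], [1.0, 0.0], delta, A)

print(bound_far, bound_near)
```

A bound at or below one half, as in the second case, does not mean the 2nd candidate loses; it means only that the inequality guarantees nothing for that pair of platforms.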


¹⁰ Tchebyshev’s inequality can be stated as follows: Let $z$ be a random variable
with mean $\delta$ and standard deviation $\sigma$. Then

$P(|z - \delta| < k) \geq 1 - \sigma^2/k^2$

where $k$ is an arbitrary positive number. See, e.g., S. Ehrenfeld and S. Littauer,
Introduction to Statistical Method (New York: McGraw-Hill, 1964), pp. 132-133, for a
proof of an alternative form of Tchebyshev’s inequality.

¹¹ An intuitive understanding of the meaning of the triangle inequality can be
gained with recourse to the following example: Let u, r, and s denote three points in
space. (One may think of the points u, r, and s as being the three vertices of a
triangle.) Then the distance between any two of the points, say u and r, must be less
than or equal to the distance between u and s plus the distance between s and r.
For a proof of the triangle inequality, see P. R. Halmos, Finite Dimensional Vector
Spaces (2d ed., Princeton: D. Van Nostrand Co., 1958), pp. 125-26.




It is obvious that if the inequality (3.21) is reversed, an
argument similar to the above one can be presented to show that if

(3.31) $\|\theta_2 - \delta\| - \|\theta_1 - \delta\| > 2\sqrt{2}$

then the 1st candidate wins the election. The following theorem is
established:

Theorem: 3.2: Given the assumptions resulting in voting rules (3.2)
and (3.3), and given that the elements of $\delta$ and $\Sigma$ are finite, then no
matter what the form of the density $f(x)$, (3.29) gives a bound for the 2nd
candidate to win and (3.31) gives a bound for the 1st candidate to win.

The vector $\delta$ of the means of the policies preferred by the voters
is a powerful influence upon the policies emerging from the
competition of the democratic process. If either candidate selects
policies which depart radically from these means, then the other
candidate can win easily by choosing policies close to these means.
Furthermore, it should be pointed out that the Tchebyshev
inequality gives a rather generous bound for most distributions. This
generosity is increased even further here not only by the fact that
(3.22) is one-sided, but also by the fact that steps (3.24) and
(3.25) are inequalities. Hence, it appears that the actual bounds
are likely to be narrower than the ones indicated by (3.29) and
(3.31). Once again, it appears that the competition of the
democratic process tends to force (or at least encourage) the policies
which emerge from that process to be not too different from the
policies which a beneficent dictator might choose.

In fact, the influence of the vector $\delta$ of means is even more
powerful than might be imagined from the previous analysis.
Suppose that either of the following is true: (1) The population
becomes more sophisticated in the sense that the number of indices
required to characterize a given set of issues of policy increases.
(2) The number of policy issues increases, thus causing an increase
in the number of indices. Let the candidates choose the platforms

(3.32) $\theta_1 = \delta$ ; $\theta_2 = \delta + \epsilon \mathbf{1}$, $\epsilon \neq 0$

where $\epsilon$ is any non-zero scalar (no matter how small) and

(3.33) $\mathbf{1} = [1, 1, \ldots, 1]'$

is an $n$ component column vector of 1’s. Then as the number of
independent indices $n$ increases, the proportion of the vote going to
the 1st candidate increases to one. The only qualification on this
proposition is that $n$ must increase in such a manner that the $n \times n$
matrix $A$ remains positive definite. This qualification may be
interpreted as meaning that no two indices measure the same policy
characteristic.
From (3.15) and (3.14) it can be seen that

(3.34) $y > -\dfrac{\|\theta_2 - \delta\|^2}{2\,\|\theta_1 - \theta_2\|^*} = t$

is equivalent to (3.2) due to the definition (3.32) of $\theta_1$ and the
fact that $\|\delta - \delta\| = 0$. Assuming as before (and without loss of
generality) that $\Sigma \leq A^{-1}$ so that $A\Sigma A \leq A$, then it follows from
the definitions (2.4) and (2.6) of the two norms that

(3.35) $\|\theta_1 - \theta_2\|^* \leq \|\theta_1 - \theta_2\| = \|\theta_2 - \delta\|$

due to the definition (3.32) of $\theta_1$. Therefore

(3.36) $\dfrac{\|\theta_2 - \delta\|^2}{2\,\|\theta_1 - \theta_2\|^*} \geq \dfrac{\|\theta_2 - \delta\|^2}{2\,\|\theta_2 - \delta\|} = (1/2)\sqrt{(\theta_2 - \delta)'A(\theta_2 - \delta)}$

and, noting the definition (3.32) of $\theta_2$, the right hand side of
(3.36) is equal to

(3.37) $(1/2)\sqrt{\epsilon^2\,\mathbf{1}'A\mathbf{1}} = (|\epsilon|/2)\sqrt{\mathbf{1}'A\mathbf{1}}$

where $\mathbf{1}$ is defined by (3.33). Let $e_n$ represent the minimum
eigenvalue of the $n \times n$ matrix $A$.¹² Then $e_n > 0$ due to the fact that the
matrix $A$ is assumed to be positive definite.¹³ Also, for any positive
definite matrix $A$ and any $n$ component vector $z$

(3.38) $z'Az \geq e_n \sum_{i=1}^{n} z_i^2$

so that for the case in point

(3.39) $\mathbf{1}'A\mathbf{1} \geq n e_n$

since the square of one is one and there are $n$ ones in the vector $\mathbf{1}$.
Substituting (3.39) into the right hand side of (3.37), it is easily
seen from (3.36) that

(3.40) $\dfrac{\|\theta_2 - \delta\|^2}{2\,\|\theta_1 - \theta_2\|^*} \geq (|\epsilon|/2)\sqrt{n e_n}$

Let $n \to \infty$ in such a manner that the $n \times n$ matrix $A$ remains
positive definite so that the $e_n$ are bounded away from the origin. Then

(3.41) $(|\epsilon|/2)\sqrt{n e_n} \to \infty$ as $n \to \infty$

and from this limit, definition (3.34) of $t$, and relationship (3.40),
$t \to -\infty$ as $n \to \infty$. Therefore

¹² An eigenvalue can be defined as follows: Let $B$ represent an $n \times n$ matrix and
$z$ an $n$ component column vector. Consider the relationship

$Bz = \lambda z$

where $\lambda$ is a scalar. An eigenvalue of the matrix $B$ is some value of the scalar $\lambda$ for
which this relationship obtains for $z \neq 0$. A matrix of order $n$ has at most $n$
distinguishable eigenvalues. The discussion above is concerned with the smallest of the
eigenvalues of the matrix $A$.

¹³ See, e.g., G. Hadley, Linear Algebra (Reading: Addison-Wesley, 1961), p. 256.
(3.42) $P(y > t) \to 1$ as $n \to \infty$

where the 1 in (3.42) is the number one. The following theorem is
established:

Theorem: 3.3: Given the assumptions resulting in voting rules (3.2)
and (3.3), and given that the platforms of the two candidates are
defined by (3.32), then if $n \to \infty$ while the $n \times n$ matrix $A$ remains positive
definite, the fraction of the total vote going to the 1st candidate
approaches one.

This theorem indicates the power of the influence of the vector
$\delta$ of means of the preferred positions. It also has a number of
interesting interpretations. One might infer, for example, that as the
population becomes more sophisticated in the manner in which
policies are viewed, and as the number of issues of policy
increases, then the chance of an extremist candidate winning the
election goes down no matter what the density of preferred points.
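
The growth of the 1st candidate's vote share with the number of indices can be made concrete in a special case. The sketch below adds assumptions that are not in the theorem: $A$ and $\Sigma$ are both taken to be the $n \times n$ identity matrix and the density is taken to be normal, so that $t = -|\epsilon|\sqrt{n}/2$ and the share is the standard normal probability of exceeding $t$. The function name and the value of $\epsilon$ are hypothetical.

```python
import math
from statistics import NormalDist

def first_candidate_share(n, eps):
    """Share for the 1st candidate at the mean when theta2 = delta + eps * 1.

    Special case assumed here: A = Sigma = identity and a normal density,
    so both norms of eps * 1 equal |eps| * sqrt(n).
    """
    t = -abs(eps) * math.sqrt(n) / 2
    return 1 - NormalDist().cdf(t)

eps = 0.1   # hypothetical small deviation on every index
shares = [first_candidate_share(n, eps) for n in (1, 100, 10000)]
print([round(s, 4) for s in shares])
```

Even a deviation of one tenth of a unit on each index, negligible when there is a single index, becomes nearly fatal when the same deviation is repeated over ten thousand independent indices.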

4. Candidate Selection by Primaries and a General Election

The analysis of Section 3 ignored the phenomenon of political
parties. Certainly, the mere fact that parties select the candidates
who run in the general election may place restrictions upon the
strategy or platform which the candidates can choose. Even when
the terms strategy and platform (used interchangeably here) are
defined to mean “that for which the candidate stands” (rather
than the formal documents drawn up by the U.S. parties), it must
be admitted that in a sense the candidate “represents” the party.
Consequently, it is of interest to examine a situation in which a
candidate has first to win the nomination in his own party and then
must compete in the election on the basis of the same strategy
(platform) which won for him the party's nomination.

Let the totality of registered voters be divided into two mutually
exclusive and exhaustive populations (parties) which are denoted
“one” and “two” respectively.¹⁴ Let $w_i$ represent the preferred
position of the $i$-th voter from the 1st population and $v_j$ the preferred
position of the $j$-th voter from the 2nd population. Also, represent
by $f_1(w)$ and $f_2(v)$ the respective densities of the preferred
positions of the voters of the 1st and 2nd populations. The means of
these densities are defined by

(4.1) $E(w) = \delta_1$ ; $E(v) = \delta_2$

and the variance-covariance matrices are defined by

¹⁴ Note that this exhaustive division means that no independent voters are allowed.
(4.2) $E(w - \delta_1)(w - \delta_1)' = \Sigma_1$
      $E(v - \delta_2)(v - \delta_2)' = \Sigma_2$

where each $\Sigma_k$ is $n \times n$. If $\theta$ represents a policy vector, then let

(4.3) $L_{1i}(\theta) = (w_i - \theta)'A(w_i - \theta)$
      $L_{2j}(\theta) = (v_j - \theta)'A(v_j - \theta)$

represent the respective loss functions of the $i$-th and $j$-th voters from
the 1st and 2nd populations. Note especially that the $n \times n$
positive definite matrix $A$ is common to all voters in both populations.
Of course, it is important to observe that this does not prevent
wide differences in taste from existing between the two
homogeneous populations since no restrictions are placed upon the
preferred positions (the $w_i$ and $v_j$) of the voters in the
populations. Differences between the two populations will be discussed
in terms of the parameters defined by (4.1) and (4.2). Finally, it
is assumed here, as in the previous section, that

(4.4) $\Sigma_k \leq A^{-1}$, $k = 1, 2$

which, as was explained earlier, is no restriction due to the fact
that loss functions are uniquely defined only up to a monotonic
transformation.

The analysis here is developed under the assumption that a 
purely democratic process produces the nominations. This pre- 
sumption represents something of a departure from reality, at 
least for the U.S. where conventions have the responsibility for 
candidate selection.¹⁵ Yet, it is informative to assume that the
candidate really does “represent” the party in the sense that he 
is the winner of an all inclusive within-party election. 

By boldly making this assumption and also by presuming that 
within any party the number of candidates is always two, the 
analysis of Section 3 can be applied to the nominations. Thus it 
is assumed that the candidates have platforms which are the 
means of the preferred points of the members of their respective 
parties. Accordingly, let the 1st and 2nd candidates be the respective
nominees of the 1st and 2nd parties. Then

(4.5) $\theta_1 = \delta_1$ ; $\theta_2 = \delta_2$

are presumed to be the respective platforms of the two candidates.

There remains the problem of specifying the voters’ rule of 


¹⁵ The possibility of “bias” is easily seen by adopting Buchanan and Tullock’s
argument concerning representation to the process of nomination by convention. J. M.
Buchanan and G. Tullock, The Calculus of Consent (Ann Arbor: University of Michigan
Press, 1962), pp. 217-22.
choice between these two candidates in the general election.
Ignoring party loyalty, it is presumed that the $i$-th and $j$-th individuals
vote for the 1st candidate if

(4.6) $(w_i - \theta_1)'A(w_i - \theta_1) < (w_i - \theta_2)'A(w_i - \theta_2)$
      $(v_j - \theta_1)'A(v_j - \theta_1) < (v_j - \theta_2)'A(v_j - \theta_2)$

holds and for the 2nd candidate if

(4.7) $(w_i - \theta_1)'A(w_i - \theta_1) > (w_i - \theta_2)'A(w_i - \theta_2)$
      $(v_j - \theta_1)'A(v_j - \theta_1) > (v_j - \theta_2)'A(v_j - \theta_2)$

obtains.¹⁶ Recalling (4.5), it is clear that the voter’s choice depends
upon the two vectors of means $\delta_1$ and $\delta_2$. Since $\delta_1 \neq \delta_2$ is assumed
always, and since $f_1(w)$ and $f_2(v)$ are viewed as being continuous
densities, there is no problem in ignoring the possibility of some
voter being faced with equal losses from the two platforms.¹⁷

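Once again, it is desirable to get these voting rules into a form
more amenable to analysis. By performing the operations exhibited
in (3.5)-(3.10) and recalling that $\theta_1 = \delta_1$ and $\theta_2 = \delta_2$ as stated by
(4.5), one obtains

(4.8) $2(w - \delta_1)'A(\delta_1 - \delta_2) > -\|\delta_1 - \delta_2\|^2$
      $2(v - \delta_2)'A(\delta_1 - \delta_2) > \|\delta_1 - \delta_2\|^2$

as expressions equivalent to those of (4.6). Note that the $i$ and $j$
subscripts are omitted for convenience. It is obvious that

(4.9) $E[(w - \delta_1)'A(\delta_1 - \delta_2)] = 0$
      $E[(v - \delta_2)'A(\delta_1 - \delta_2)] = 0$

and it can be shown that

(4.10) $\operatorname{Var}[(w - \delta_1)'A(\delta_1 - \delta_2)] = (\delta_1 - \delta_2)'A\Sigma_1 A(\delta_1 - \delta_2)$
       $\operatorname{Var}[(v - \delta_2)'A(\delta_1 - \delta_2)] = (\delta_1 - \delta_2)'A\Sigma_2 A(\delta_1 - \delta_2)$

Define as in (2.6)

(4.11) $\sqrt{(\delta_1 - \delta_2)'A\Sigma_1 A(\delta_1 - \delta_2)} = \|\delta_1 - \delta_2\|_1^*$
       $\sqrt{(\delta_1 - \delta_2)'A\Sigma_2 A(\delta_1 - \delta_2)} = \|\delta_1 - \delta_2\|_2^*$

and

(4.12) $y_1 = \dfrac{(w - \delta_1)'A(\delta_1 - \delta_2)}{\|\delta_1 - \delta_2\|_1^*}$
       $y_2 = \dfrac{(v - \delta_2)'A(\delta_1 - \delta_2)}{\|\delta_1 - \delta_2\|_2^*}$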

¹⁶ It might be noted that, at least with some interpretation, the voting rule need
not conflict with the notion of party loyalty. See the discussion in Chapters 5 and I
of A. Campbell, P. E. Converse, W. E. Miller, and D. E. Stokes, The American Voter
(New York: Wiley, 1960).

¹⁷ Of course, it is easy to take the equal loss possibility into account by assuming
that in such an instance a voter will choose the candidate of his own party. The point
is that this additional assumption does not alter the results of the analysis.
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SO that 

(4.13) E(yx)=0 ; E(y.)=0 
and 

(4.14) Var(yO = l ; Var(yO=l 
Also define 


(4.15) 



II 8, -8.1 

l(‘ 

2 

1 [ 81 82 

11^* 



81 82 

2 

2 

81 82 

2 


It follows that 

(4.16) yi>ti ; ya > t* 

is equivalent to (4.6) . In other words, voters from the and 2“* 
parties respectively cast their ballots for the 1“^ candidate if and 
only if (4.16) obtains. 

It is necessary to obtain an expression for the portion (fraction)
of the total vote which the 1st candidate receives. Let $\alpha$
represent that fraction of the total number of voters belonging to the
1st party. Then $1 - \alpha$ represents the fraction of the total number
of voters belonging to the 2nd party. Imagine selecting a voter at
random from each of the 1st and 2nd populations. Then

(4.17)  $R = \alpha P(y_1 > t_1) + (1 - \alpha)P(y_2 > t_2)$

represents the fraction of the total vote going to the 1st candidate.
Obviously, $1 - R$ is the fraction of the vote going to the 2nd
candidate. Thus the 1st candidate wins the election if $R > 1/2$, and
the 2nd candidate wins if $R < 1/2$.


Recall that the norm $\|\delta_1 - \delta_2\|$ can be interpreted as the
"distance" between the mean vectors of the two populations. It is of
interest to determine the effect of increases in this distance.

From assumption (4.4) and the definitions (2.4) and (4.11) of
the two types of norms under consideration here, it follows that

(4.18)  $\|\delta_1 - \delta_2\| \ge \|\delta_1 - \delta_2\|_k^*$ , $k = 1, 2$.

This means that

(4.19)  $t_1 \le -\frac{\|\delta_1 - \delta_2\|}{2}$ ; $t_2 \ge \frac{\|\delta_1 - \delta_2\|}{2}$

Let distance between the two mean vectors increase. As
$\|\delta_1 - \delta_2\| \to \infty$, then from (4.19) $t_1 \to -\infty$
and $t_2 \to \infty$ so that $P(y_1 > t_1) \to 1$ and
$P(y_2 > t_2) \to 0$. From (4.17), $R \to \alpha$. Thus as the distance
between the mean vectors of the two parties increases, voters tend
to stick more and more with their own parties until in the limit the
minority party has no chance and the majority party always wins.
One can speculate that such a situation, where there are large
differences between the (opposing) desires of the two groups
and the minority has no chance of exerting any influence upon
policy, is not very conducive to the continuation of a democracy.
It is plausible to believe that conflict is likely to result and it is
interesting to ponder real world situations such as the Cyprus
problem in the light of this result.

It is appropriate to consider the relationship between the manner
in which the total vote is divided and the parameters $\Sigma_1$ and
$\Sigma_2$. Letting $\|\delta_1 - \delta_2\|$ be a finite number, suppose
that

(4.20)  $\Sigma_1 \ge \Sigma_2$

so that the 1st party is allowed to represent a "wider range" of taste
or opinion than is the 2nd party. Granted this greater spread of
preferred points, it is interesting to determine the conditions under
which the 1st party's candidate can win the election.

Let both $f_1(w)$ and $f_2(v)$ be multivariate normal densities. Then
it is easily seen from the definition (4.12) that both $y_1$ and $y_2$
are normally distributed with zero means and unit variances. Define

(4.21)  $k_1 = \frac{\|\delta_1 - \delta_2\|^2}{2\|\delta_1 - \delta_2\|_1^*} = -t_1$
        $k_2 = -\frac{\|\delta_1 - \delta_2\|^2}{2\|\delta_1 - \delta_2\|_2^*} = -t_2$

so that by the symmetry of the unit normal distribution

(4.22)  $P(y_1 > t_1) = P(y_1 < k_1)$
        $P(y_2 > t_2) = P(y_2 < k_2)$

and (4.17) can be written equivalently

(4.23)  $R = \alpha P(y_1 < k_1) + (1 - \alpha)P(y_2 < k_2)$

Note that $k_1 > 0$ so that $P(y_1 < k_1) > 1/2$ and $k_2 < 0$ so that
$P(y_2 < k_2) < 1/2$.
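As a rough numerical illustration of (4.21) and (4.23), the following
Python sketch evaluates $R$ for a single issue ($n = 1$) under the
normality assumption just made. With $n = 1$ the matrix $A$ is a
positive scalar, which cancels, so that $k_1 = |\delta_1 - \delta_2|/(2\sigma_1)$
and $k_2 = -|\delta_1 - \delta_2|/(2\sigma_2)$, where $\sigma_1$ and
$\sigma_2$ are the standard deviations of the two densities. The function
names and all numbers are hypothetical choices made only for illustration.

```python
import math

def std_normal_cdf(x):
    """P(y < x) for a unit normal variate y."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def vote_share(alpha, delta1, delta2, sigma1, sigma2):
    """Fraction R of the total vote for the 1st candidate, (4.23) with n = 1.

    alpha          : fraction of voters in the 1st party
    delta1, delta2 : party means (also the platforms, by (4.5)); must differ
    sigma1, sigma2 : standard deviations of preferred positions in each party
    """
    distance = abs(delta1 - delta2)
    k1 = distance / (2.0 * sigma1)    # scalar form of (4.21)
    k2 = -distance / (2.0 * sigma2)
    return alpha * std_normal_cdf(k1) + (1.0 - alpha) * std_normal_cdf(k2)

# Hypothetical case: a 60 percent majority party, equal dispersion,
# with the 2nd party's mean moved farther and farther away.
for d in (0.5, 2.0, 8.0, 32.0):
    print(d, round(vote_share(0.6, 0.0, d, 1.0, 1.0), 4))
```

As the distance between the means grows, the printed share approaches
$\alpha = 0.6$, in line with the limiting argument based on (4.19).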

If the 1st candidate is to win the election, then it must be the
case that his fraction of the vote is greater than one half ($R > 1/2$).
Making this assumption, one may obtain from (4.23)

(4.24)  $\alpha > \frac{\frac{1}{2} - P(y_2 < k_2)}{P(y_1 < k_1) - P(y_2 < k_2)} > 0$

Granted the assumption (4.20), it is easily seen from the definition
(4.11) of the starred norms that

(4.25)  $\|\delta_1 - \delta_2\|_1^* \ge \|\delta_1 - \delta_2\|_2^*$

From (4.25) and definition (4.21) one observes that $k_1 \le -k_2$ so
that, $y_1$ and $y_2$ both being unit normal,

(4.26)  $P(y_1 < k_1) \le P(y_2 < -k_2) = 1 - P(y_2 < k_2)$

and by substitution

(4.27)  $\frac{\frac{1}{2} - P(y_2 < k_2)}{P(y_1 < k_1) - P(y_2 < k_2)} \ge \frac{\frac{1}{2} - P(y_2 < k_2)}{1 - 2P(y_2 < k_2)} = \frac{1}{2}$

It follows from (4.27) and (4.24) that $\alpha > 1/2$. In other words, if
the 1st party is more "dispersed" than the 2nd party in the sense that
its members have more divergent points of view, opinions, and desires
for policies; if both parties choose candidates whose respective
platforms represent the party's vector of means of the preferred
positions of its members; and if the densities of preferred
positions are normal; then the 1st party can win the election only
if it is the majority party. Obviously, the converse of this statement
is also true. If the 1st party is a minority ($\alpha < 1/2$), and if
(4.20) obtains, then the 1st candidate loses the election.

The above discussion makes clear the fact that the minority
party can win under certain conditions. Therefore, one might be
interested in determining when a minority triumph can take place.
Note that if (4.24) is true, then it is implied that

(4.28)  $\frac{1}{2} < \alpha P(y_1 < k_1) + (1 - \alpha)P(y_2 < k_2)$

so that the candidate of the 1st party must win the election.
Therefore, it is important to investigate whether and under what
conditions (4.24) can obtain when $\alpha < 1/2$.

Let it be assumed that $\alpha < 1/2$ and

(4.29)  $\Sigma_1 < \Sigma_2$

so that the 1st party is more "cohesive" than the 2nd one in the sense
that it represents a "smaller range" of taste and opinion about
policy. Then the above analysis would tend to indicate that it is
possible for the 1st party to win the election. In order to explain
easily why this can be true, allow the following, somewhat more
stringent, assumption to be made.

(4.30)  $c^2\Sigma_1 = \Sigma_2$ , $c > 1$

Granted condition (4.30), definitions (4.11) imply

(4.31)  $c\|\delta_1 - \delta_2\|_1^* = \|\delta_1 - \delta_2\|_2^*$

and applying (4.31) to definitions (4.21) yields

(4.32)  $k_2 = -k_1/c$

so that substitution is possible. Noting that $k_1 > 0$ so that
$P(y_1 < k_1) > 1/2$, let $c \to \infty$. Then $-k_1/c \to 0$ so that
$P(y_2 < -k_1/c) \to 1/2$. Applying these results to (4.24) gives

(4.33)  $\alpha > \frac{\frac{1}{2} - P(y_2 < -k_1/c)}{P(y_1 < k_1) - P(y_2 < -k_1/c)} \to 0$

so that $\alpha < 1/2$ is certainly possible when (4.24) obtains. Since
(4.24) implies (4.28), the candidate of the 1st party wins the
election.
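The limiting argument behind (4.33) can be checked numerically. The
sketch below, which assumes normal densities and substitutes
$k_2 = -k_1/c$ from (4.32), prints the lower bound on $\alpha$ given by
(4.24) together with the vote share (4.23) of a 1st party holding 40
percent of the voters. The value $k_1 = 1$, the share 0.4, and the
function names are hypothetical; the case $c = 1$ lies on the boundary
of (4.30) and is included only for comparison.

```python
import math

def std_normal_cdf(x):
    """P(y < x) for a unit normal variate y."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def min_alpha_to_win(k1, c):
    """Right-hand side of (4.24) after substituting k2 = -k1/c."""
    p1 = std_normal_cdf(k1)
    p2 = std_normal_cdf(-k1 / c)
    return (0.5 - p2) / (p1 - p2)

def vote_share(alpha, k1, c):
    """R from (4.23) with k2 = -k1/c."""
    return alpha * std_normal_cdf(k1) + (1.0 - alpha) * std_normal_cdf(-k1 / c)

k1 = 1.0
for c in (1.0, 2.0, 5.0, 20.0):
    print(c, round(min_alpha_to_win(k1, c), 4), round(vote_share(0.4, k1, c), 4))
```

The bound equals one half when the two parties are equally dispersed
and falls toward zero as $c$ grows, so a sufficiently cohesive minority
obtains more than half of the vote.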


An intuitive understanding of the above result can be obtained
by recourse to a simple graph. Assume a single index of a single
issue so that $n = 1$. In Figure 3 the densities $f_1(w)$ and $f_2(v)$
are plotted and the means (the respective candidates' platforms) are
appropriately indicated. Note that the variance of the density of
preferred points for the 1st party is much smaller than the variance
of the 2nd. Inspection of the diagram makes clear the fact
that the 1st party's candidate will obtain the votes of almost all
the members of his own party and will also receive the votes of
some members of the 2nd party. Thus the candidate of the 1st party
can win even though his party is a minority.

It is interesting to speculate about the rise of the Nazi party in
Germany in the light of this result. It is also interesting to consider
Communist Party participation in the elections in certain countries
in terms of this result.


5. Platforms and the General Election 

The analysis of the above section suggests an interesting question.
Suppose that one or both of the parties is something less than
purely democratic in the selection of its candidate. Can the party
improve its chances in the general election by carefully selecting
a candidate whose personal platform is something of a "compromise"
between the desires of the members of the candidate's own
party and those of the members of the other party? The answer
seems to be affirmative. Granted the existence of two populations
(parties), this section is devoted to the demonstration of two
propositions, both of which depend upon normality. First, it will
be shown that there exists some convex combination of the two
vectors of means which is at least as good as any other type of
strategy. Second, it will be shown that there exists a particular
convex combination which dominates all others.

Let $f_1(w)$ and $f_2(v)$ be multivariate normal densities whose
means are given by (4.1) and variance-covariance matrices by
(4.2). Let (4.6) and (4.7) define the voting rule. Then for any
platforms $\theta_1$ and $\theta_2$ such that $\theta_1 \neq \theta_2$,
it can be shown by repeating steps (3.5-3.15) that voters from the
respective populations choose the 1st candidate if

(5.1)  $y_1 = \frac{(w - \delta_1)'A(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_1^*} > \frac{\|\theta_1 - \delta_1\|^2 - \|\theta_2 - \delta_1\|^2}{2\|\theta_1 - \theta_2\|_1^*} = t_1$
       $y_2 = \frac{(v - \delta_2)'A(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_2^*} > \frac{\|\theta_1 - \delta_2\|^2 - \|\theta_2 - \delta_2\|^2}{2\|\theta_1 - \theta_2\|_2^*} = t_2$

where the subscripts on $v$ and $w$ are omitted for convenience.
Obviously, if one thinks of selecting a voter at random from each
of the two populations, $y_1$ and $y_2$ are distributed as standard
normal variables. Thus a sufficient condition for the 1st candidate to
win the election in the respective populations is

(5.2)  $P(y_1 > t_1) > 1/2$
       $P(y_2 > t_2) > 1/2$

and this requires $t_1 < 0$ and $t_2 < 0$. From (5.1) it is clear that
(5.2) obtains only if

(5.3)  $\|\theta_1 - \delta_1\| < \|\theta_2 - \delta_1\|$
       $\|\theta_1 - \delta_2\| < \|\theta_2 - \delta_2\|$

so that determining that (5.3) obtains is equivalent to finding that
the 1st candidate will win the election.


Consider the first proposition. Let the 1st candidate choose a
convex combination of the two vectors of means. Thus

(5.4)  $\theta_1 = \beta_1\delta_1 + (1 - \beta_1)\delta_2$ , $0 \le \beta_1 \le 1$

represents a strategy which is to be shown to win or tie any
non-convex combination $\theta_2$ chosen by the 2nd candidate. Note
specifically that since $\theta_2$ is not a convex combination of
$\delta_1$ and $\delta_2$, the strategies $\theta_2 = \delta_1$ and
$\theta_2 = \delta_2$ are ruled out.

Suppose that the 2nd candidate chooses a platform $\theta_2$ such that

(5.5)  $\|\theta_2 - \delta_1\| > \|\delta_1 - \delta_2\|$

so that the "distance" from $\theta_2$ to the mean $\delta_1$ of the 1st
population is greater than distance between the two means. Let the 1st
candidate choose $\beta_1 = 0$ so that $\theta_1 = \delta_2$. Then

(5.6)  $\|\theta_1 - \delta_1\| = \|\delta_2 - \delta_1\| < \|\theta_2 - \delta_1\|$

so that the 1st candidate wins in the 1st population. Similarly,

(5.7)  $\|\theta_1 - \delta_2\| = \|\delta_2 - \delta_2\| = 0 < \|\theta_2 - \delta_2\|$

so that the 1st candidate also wins in the 2nd population.
Alternatively, suppose that the 2nd candidate chooses a platform
$\theta_2$ such that

(5.8)  $\|\theta_2 - \delta_2\| > \|\delta_1 - \delta_2\|$

Then let the 1st candidate select $\beta_1 = 1$ so that
$\theta_1 = \delta_1$. Thus

(5.9)  $\|\theta_1 - \delta_1\| = \|\delta_1 - \delta_1\| = 0 < \|\theta_2 - \delta_1\|$

so that the 1st candidate wins in the 1st population and

(5.10)  $\|\theta_1 - \delta_2\| = \|\delta_1 - \delta_2\| < \|\theta_2 - \delta_2\|$

so that the 1st candidate also wins in the 2nd population.

The above results mean that one need only consider the case
in which $\theta_2$ is such that both

(5.11)  $\|\theta_2 - \delta_1\| \le \|\delta_1 - \delta_2\|$
        $\|\theta_2 - \delta_2\| \le \|\delta_1 - \delta_2\|$

obtain. Accordingly, presume that the 2nd candidate chooses a
$\theta_2$ such that it is not a convex combination of $\delta_1$ and
$\delta_2$ but does satisfy (5.11). By manipulating (5.4) and taking
norms of the results one can obtain

(5.12)  $\|\theta_1 - \delta_1\| = (1 - \beta_1)\|\delta_1 - \delta_2\|$
        $\|\theta_1 - \delta_2\| = \beta_1\|\delta_1 - \delta_2\|$

Suppose that the 1st candidate chooses

(5.13)  $\beta_1 = \frac{\|\theta_2 - \delta_2\|}{\|\delta_1 - \delta_2\|}$

and note that $0 \le \beta_1 \le 1$ by (5.11). Substituting (5.13) into
the 2nd of the equalities (5.12),

(5.14)  $\|\theta_1 - \delta_2\| = \|\theta_2 - \delta_2\|$

so that the 1st and 2nd candidates tie in the 2nd population.
Substituting (5.13) into the 1st of the equalities (5.12),

(5.15)  $\|\theta_1 - \delta_1\| = \|\delta_1 - \delta_2\| - \|\theta_2 - \delta_2\|$

and noting by the triangle inequality that

(5.16)  $\|\delta_1 - \delta_2\| \le \|\theta_2 - \delta_1\| + \|\theta_2 - \delta_2\|$

so that by substituting (5.16) into (5.15)

(5.17)  $\|\theta_1 - \delta_1\| \le \|\theta_2 - \delta_1\|$

and the 1st candidate at worst ties in the 1st population. In fact, it
can be shown that for any $\beta_1$ such that

(5.18)  $1 - \frac{\|\theta_2 - \delta_1\|}{\|\delta_1 - \delta_2\|} \le \beta_1 \le \frac{\|\theta_2 - \delta_2\|}{\|\delta_1 - \delta_2\|}$

where $\theta_1$ is given by (5.4), the 1st candidate wins or at worst
ties in the general election. Note that, by the triangle inequality,
there is at least one $\beta_1$ in the interval. The following theorem
is established:


Theorem 5.1: Given the assumptions resulting in the voting rule (4.6)
and (4.7), given that the densities of the preferred positions of the
members of the two populations are normal, and given that one candidate
selects a platform which is not a convex combination of the two vectors
of means; then there exists a platform which is a convex combination of
the two vectors of means such that the candidate choosing the latter
platform will either win or tie in the general election.


The above theorem means that the strategy of selecting as a
platform a convex combination of the two vectors of means can be
at least as good as any other type of platform which can be devised.
Therefore, it can be argued that if both candidates are free
to choose whatever platform they desire, each should select a
convex combination of the two vectors of means.
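The construction used in the proof can be traced through a small
numerical case. The Python sketch below takes $A$ to be the identity
matrix, so that $\|x\|$ is the ordinary Euclidean length; this choice,
the two-issue numbers, and the helper names are assumptions made for
illustration only. Given a platform $\theta_2$ off the segment joining
the means, it computes $\beta_1$ from (5.13), forms $\theta_1$ by (5.4),
and prints the distances compared in (5.14) and (5.17).

```python
import math

def norm(x):
    """||x|| = sqrt(x'Ax) with A assumed to be the identity matrix."""
    return math.sqrt(sum(c * c for c in x))

def sub(x, y):
    """Componentwise difference x - y."""
    return [a - b for a, b in zip(x, y)]

def convex_reply(delta1, delta2, theta2):
    """Platform (5.4) with beta1 chosen as in (5.13)."""
    beta1 = norm(sub(theta2, delta2)) / norm(sub(delta1, delta2))
    theta1 = [beta1 * a + (1.0 - beta1) * b for a, b in zip(delta1, delta2)]
    return beta1, theta1

delta1, delta2 = [0.0, 0.0], [4.0, 0.0]
theta2 = [2.0, 1.0]   # not a convex combination of the means; satisfies (5.11)
beta1, theta1 = convex_reply(delta1, delta2, theta2)

print(beta1, theta1)
print(norm(sub(theta1, delta2)), norm(sub(theta2, delta2)))   # equal, as in (5.14)
print(norm(sub(theta1, delta1)), norm(sub(theta2, delta1)))   # first <= second, (5.17)
```

The first pair of distances coincides, so the candidates tie in the 2nd
population, while the 1st candidate's platform is strictly closer to
$\delta_1$ and therefore wins in the 1st population.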


Suppose not only that $\theta_1$ is given by (5.4) but also that the
2nd candidate selects a platform

(5.19)  $\theta_2 = \beta_2\delta_1 + (1 - \beta_2)\delta_2$ , $0 \le \beta_2 \le 1$

so that attention is now centered on the instance in which both
candidates have these convex combinations as their platforms. It
is to be shown that there exists a particular convex combination
which wins over all other convex combinations.

In a manner similar to that in which (5.12) was obtained, one
may manipulate (5.19) and express the results in terms of norms
to get

(5.20)  $\|\theta_2 - \delta_1\| = (1 - \beta_2)\|\delta_1 - \delta_2\|$
        $\|\theta_2 - \delta_2\| = \beta_2\|\delta_1 - \delta_2\|$

By substituting (5.20) and (5.12) into the conditions (5.3) for
the 1st candidate to win in the respective populations, it is clear
that $\beta_1 > \beta_2$ is required in the 1st population and
$\beta_1 < \beta_2$ is required in the 2nd population. Therefore, if the
two candidates choose the respective platforms (5.4) and (5.19), the
fact that the 1st candidate wins in one population implies that the 2nd
candidate wins in the other population.

Recall that the fraction of the total vote going to the 1st
candidate is

(5.21)  $R = \alpha P(y_1 > t_1) + (1 - \alpha)P(y_2 > t_2)$

By noting that $y_1$ and $y_2$ are both distributed as unit normal
variates, it follows that

(5.22)  $P(y_1 > t_1) = P(y_1 < -t_1)$
        $P(y_2 > t_2) = P(y_2 < -t_2)$

so that

(5.23)  $R = \alpha P(y_1 < -t_1) + (1 - \alpha)P(y_2 < -t_2)$

is equivalent to (5.21).
It is desirable to get the expressions for $t_1$ and $t_2$ into forms
more suitable for analysis. By the definitions (5.4) of $\theta_1$ and
(5.19) of $\theta_2$, it is easily seen that

(5.24)  $\theta_1 - \theta_2 = (\beta_1 - \beta_2)\delta_1 + (\beta_2 - \beta_1)\delta_2 = (\beta_1 - \beta_2)(\delta_1 - \delta_2)$

so that by taking the appropriate norm

(5.25)  $\|\theta_1 - \theta_2\|_r^* = |\beta_1 - \beta_2| \, \|\delta_1 - \delta_2\|_r^*$ , $r = 1, 2$

where

(5.26)  $\|\delta_1 - \delta_2\|_r^* = \sqrt{(\delta_1 - \delta_2)'A\Sigma_rA(\delta_1 - \delta_2)}$ , $r = 1, 2$.

By recalling the definition (5.1) of $t_1$, substituting from the 1st of
the equivalences of (5.12) and (5.20) for the appropriate terms
in the numerator, and substituting (5.25) for the denominator,

(5.27)  $t_1 = \frac{[(1 - \beta_1)^2 - (1 - \beta_2)^2]\,\|\delta_1 - \delta_2\|^2}{2|\beta_1 - \beta_2| \, \|\delta_1 - \delta_2\|_1^*} = -\frac{(2 - \beta_1 - \beta_2)(\beta_1 - \beta_2)\,\|\delta_1 - \delta_2\|^2}{2|\beta_1 - \beta_2| \, \|\delta_1 - \delta_2\|_1^*}$

are obtained easily after appropriate manipulation.

It is now necessary to make an assumption concerning the relative
magnitudes of $\beta_1$ and $\beta_2$. There are two cases to be
considered. First, presume that $\beta_1 > \beta_2$ so that
$(\beta_1 - \beta_2) = |\beta_1 - \beta_2|$. Thus it follows from the
last expression of (5.27) that

(5.28)  $-t_1 = (1 - \lambda)s_1$ , $\beta_1 > \beta_2$

where

(5.29)  $\lambda = \frac{\beta_1 + \beta_2}{2}$ , $s_1 = \frac{\|\delta_1 - \delta_2\|^2}{\|\delta_1 - \delta_2\|_1^*}$

By repeating these steps with respect to $t_2$, it is easily seen that

(5.30)  $-t_2 = -\lambda s_2$ , $\beta_1 > \beta_2$

where

(5.31)  $s_2 = \frac{\|\delta_1 - \delta_2\|^2}{\|\delta_1 - \delta_2\|_2^*}$

and note that both $s_1$ and $s_2$ are positive constants while
$\lambda$ is a variable. Expressions (5.28) and (5.30) are useful for
analysis.

It is now necessary to consider the instance in which
$\beta_1 < \beta_2$ so that $(\beta_1 - \beta_2) = -|\beta_1 - \beta_2|$.
By noting the last of the expressions in (5.27), it is easily seen that

(5.32)  $-t_1 = -(1 - \lambda)s_1$ , $\beta_1 < \beta_2$

and it also follows that

(5.33)  $-t_2 = \lambda s_2$ , $\beta_1 < \beta_2$

and expressions (5.32) and (5.33) are useful for analysis.
Define $R_1(\lambda)$ to be the fraction of the total vote going to the
1st candidate when $\beta_1 > \beta_2$. Then from (5.30), (5.28), and
(5.23)

(5.34)  $R_1(\lambda) = \alpha P[y_1 < (1 - \lambda)s_1] + (1 - \alpha)P[y_2 < -\lambda s_2]$

and recalling that $y_1$ and $y_2$ are unit normal variates

(5.35)  $\frac{dR_1(\lambda)}{d\lambda} = -\alpha\frac{s_1}{\sqrt{2\pi}}\exp\left[-\frac{(1 - \lambda)^2 s_1^2}{2}\right] - (1 - \alpha)\frac{s_2}{\sqrt{2\pi}}\exp\left[-\frac{\lambda^2 s_2^2}{2}\right] < 0$

so it follows that $R_1(\lambda)$ is a monotonically decreasing function
of $\lambda$ in the interval $0 \le \lambda \le 1$.

Define $R_2(\lambda)$ to be the fraction of the total vote going to the
1st candidate when $\beta_1 < \beta_2$. Then from (5.33), (5.32) and
(5.23)

(5.36)  $R_2(\lambda) = \alpha P[y_1 < -(1 - \lambda)s_1] + (1 - \alpha)P[y_2 < \lambda s_2]$

but by noting that

(5.37)  $P[y_1 < -(1 - \lambda)s_1] = 1 - P[y_1 < (1 - \lambda)s_1]$
        $P[y_2 < \lambda s_2] = 1 - P[y_2 < -\lambda s_2]$

(5.36) can be written

(5.38)  $R_2(\lambda) = 1 - R_1(\lambda)$

so that $R_2(\lambda)$ is a monotonically increasing function of
$\lambda$ in the interval $0 \le \lambda \le 1$.

From the monotonic properties of $R_1(\lambda)$ and $R_2(\lambda)$, and
from expression (5.38), it follows that there must exist a value
$\lambda^*$ of $\lambda$ such that

(5.39)  $R_1(\lambda^*) = R_2(\lambda^*) = 1/2$

Now suppose that the 1st candidate chooses $\beta_1 = \lambda^*$. Then
there are two cases to be examined.

Suppose first that $\lambda^* = \beta_1 > \beta_2$. Thus

(5.40)  $\lambda = (\beta_1 + \beta_2)/2 < \beta_1 = \lambda^*$

and as $\beta_1 > \beta_2$, $R_1(\lambda)$ must be examined. Since from
(5.40) $\lambda < \lambda^*$, and due to the fact that $R_1(\lambda)$ is
a monotonically decreasing function of $\lambda$,

(5.41)  $R_1(\lambda) > R_1(\lambda^*) = 1/2$

so that the 1st candidate wins the election.

Suppose next that $\lambda^* = \beta_1 < \beta_2$. Then

(5.42)  $\lambda = (\beta_1 + \beta_2)/2 > \beta_1 = \lambda^*$

and as $\beta_1 < \beta_2$, $R_2(\lambda)$ must be examined. Since
$\lambda > \lambda^*$ from (5.42), and due to the fact that
$R_2(\lambda)$ is a monotonically increasing function of $\lambda$,

(5.43)  $R_2(\lambda) > R_2(\lambda^*) = 1/2$

so that the 1st candidate wins the election.
Relations (5.41) and (5.43) show that the particular convex
combination

(5.44)  $\theta^* = \lambda^*\delta_1 + (1 - \lambda^*)\delta_2$

wins over all other convex combinations. Of course, if both
candidates select (5.44), then a tie is expected in the election. The
following theorem is established:

Theorem 5.2: Given the assumptions resulting in the voting rule (4.6)
and (4.7), given that the densities of the preferred positions of the
members of the two populations are normal, and given that the two
candidates select their platforms from the class (5.4) and (5.19), then
the platform (5.44) is a dominant strategy.
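Because $R_1(\lambda)$ is decreasing by (5.35), with
$R_1(0) > 1/2 > R_1(1)$ whenever $0 < \alpha < 1$, the value $\lambda^*$
of (5.39) can be located by bisection. The Python sketch below does
this for hypothetical values of $\alpha$, $s_1$, and $s_2$ and then
checks, over a grid of rival choices $\beta_2$, that the platform with
$\beta_1 = \lambda^*$ never receives less than half of the vote. The
function names, the grid, and the treatment of identical platforms as a
tie are illustrative assumptions.

```python
import math

def std_normal_cdf(x):
    """P(y < x) for a unit normal variate y."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def r1(lam, alpha, s1, s2):
    """R1(lambda) as in (5.34)."""
    return (alpha * std_normal_cdf((1.0 - lam) * s1)
            + (1.0 - alpha) * std_normal_cdf(-lam * s2))

def lambda_star(alpha, s1, s2, tol=1e-12):
    """Solve R1(lambda) = 1/2 on [0, 1] by bisection (R1 is decreasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if r1(mid, alpha, s1, s2) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def vote_share(beta1, beta2, alpha, s1, s2):
    """Share of the 1st candidate when the platforms are (5.4) and (5.19)."""
    lam = 0.5 * (beta1 + beta2)
    if beta1 > beta2:
        return r1(lam, alpha, s1, s2)          # case (5.34)
    if beta1 < beta2:
        return 1.0 - r1(lam, alpha, s1, s2)    # case (5.36), via (5.38)
    return 0.5                                 # identical platforms: a tie

alpha, s1, s2 = 0.35, 1.5, 0.8   # hypothetical values
star = lambda_star(alpha, s1, s2)
worst = min(vote_share(star, b / 100.0, alpha, s1, s2) for b in range(101))
print(star, worst)
```

The smallest share found on the grid stays above one half, which is the
dominance property asserted in the theorem for this particular case.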

The above theorems have an interesting implication for the pro- 
cess by which parties select candidates. From Section 3 it is clear 
that the vector of means of the preferred positions of the party 
membership exerts a powerful influence upon the platform of a 
candidate emerging from a truly representative, democratic pro- 
cess. It can be argued that in the Western World there are power- 
ful forces causing the parties to become ‘‘more democratic” in 
regard to nominations. Yet, both of the above theorems indicate 
that a party can improve its chances of winning by giving con- 
sideration to the preferences of the members of the other party. 
The “dilemma of nominations” is that if the party membership is 
not able to take a strategic point of view and if “political bosses” 
are able to take such a point of view, then having the “smoke-filled 
cloakroom nominations” may improve the party’s chances in the 
election. 

There are two unfortunate points to be made. First, the fact 
that the above theorems are separate implies that it is unknown 
whether (5.44) is a dominant strategy overall. Second, the proof 
of theorem 5.2 is not constructive in the sense that the numerical 
value of X* is unknown. Therefore, it may be useful to present a 
simple example. 

Suppose that $\alpha = 1/2$. Then it is easy to verify from (5.39) that

(5.45)  $P[y_1 < (1 - \lambda^*)s_1] + P[y_2 < -\lambda^* s_2] = 1$

or, by noting the last of the relationships (5.37),

(5.46)  $P[y_1 < (1 - \lambda^*)s_1] = P[y_2 < \lambda^* s_2]$

so that by defining

(5.47)  $\lambda^* = \frac{s_1}{s_1 + s_2}$

relationship (5.46) is satisfied. If in addition $\Sigma_1 = \Sigma_2$
so that $s_1 = s_2$, then $\lambda^* = 1/2$ and the dominant strategy

(5.48)  $\theta^* = (\delta_1 + \delta_2)/2$

is a simple average of the two vectors of means.
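The closed form (5.47) is easy to confirm numerically: with
$\alpha = 1/2$, substituting $\lambda^* = s_1/(s_1 + s_2)$ into (5.34)
should return exactly one half. The following Python sketch performs
this check for a few hypothetical pairs $(s_1, s_2)$; the function names
are illustrative.

```python
import math

def std_normal_cdf(x):
    """P(y < x) for a unit normal variate y."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def r1(lam, alpha, s1, s2):
    """R1(lambda) as in (5.34)."""
    return (alpha * std_normal_cdf((1.0 - lam) * s1)
            + (1.0 - alpha) * std_normal_cdf(-lam * s2))

def lambda_star_equal_parties(s1, s2):
    """Closed form (5.47), valid when alpha = 1/2."""
    return s1 / (s1 + s2)

for s1, s2 in ((1.0, 1.0), (2.0, 0.5), (0.3, 1.7)):
    lam = lambda_star_equal_parties(s1, s2)
    print(s1, s2, lam, r1(lam, 0.5, s1, s2))
```

When $s_1 = s_2$ the computed $\lambda^*$ is one half, which corresponds
to the simple average (5.48).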


6. A Simple Extension 

All of the previous analysis presumes a certain homogeneity of 
the taste of voters in the sense that the matrix A enters all loss 
functions. While this assumption is convenient, it does not allow 
for the simple situation in which some voters do not care about 
some subset of issues. Accordingly, one situation of this type is 
considered here. 


Again let the totality of voters be divided into two mutually
exclusive and exhaustive populations. Let $w_i$ and $v_j$ be $n$
component vectors representing the preferred positions of the $i$th
voter in the 1st population and the $j$th voter in the 2nd population
respectively. Then $f_1(w)$ and $f_2(v)$ represent the densities of the
preferred positions. The mean vectors are given by (4.1) and the
variance-covariance matrices by (4.2).

Instead of using the matrix $A$ in all loss functions, let

(6.1)  $L_{1i}(\theta) = (w_i - \theta)'A_1(w_i - \theta)$
       $L_{2j}(\theta) = (v_j - \theta)'A_2(v_j - \theta)$

represent the respective loss functions of the $i$th and $j$th voters
from the 1st and 2nd populations. Suppose that both $A_1$ and $A_2$ are
singular $n \times n$ matrices and are given by

(6.2)  $A_1 = \begin{pmatrix} M & O \\ O & O \end{pmatrix}$ , $A_2 = \begin{pmatrix} O & O \\ O & N \end{pmatrix}$

where $M$ is an $m \times m$ positive definite matrix ($m < n$) and $N$
is a $(n - m) \times (n - m)$ positive definite matrix. Note that the
specification of $A_1$ means that all voters in the 1st population
obtain possible utility losses only from the first $m$ components of
political choice. Therefore, these voters do not care about the last
$(n - m)$ components of choice. Similarly, the specification of $A_2$
implies that all voters in the 2nd population obtain possible utility
losses only from the last $(n - m)$ components of political choice and
do not care about the first $m$ components. Note that $A_1A_2 = 0$. One
might say that there is no interaction between the desires of the
two populations.

It might be observed that the specification (6.2) of $A_1$ and $A_2$
raises a question concerning the legitimacy of calling $w_i$ and $v_j$
the preferred positions of the $i$th and $j$th voters. The following
developments should make clear the fact that the difficulty is entirely
terminological.

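Presuming the existence of two candidates with the platforms
$\theta_1$ and $\theta_2$ where $\theta_1 \neq \theta_2$, it can be shown
by postulating the usual voting rules and repeating steps (3.5-3.15)
that voters from the respective populations choose the 1st candidate if

(6.3)  $y_1 = \frac{(w - \delta_1)'A_1(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_1^*} > \frac{\|\theta_1 - \delta_1\|_1^2 - \|\theta_2 - \delta_1\|_1^2}{2\|\theta_1 - \theta_2\|_1^*}$
       $y_2 = \frac{(v - \delta_2)'A_2(\theta_1 - \theta_2)}{\|\theta_1 - \theta_2\|_2^*} > \frac{\|\theta_1 - \delta_2\|_2^2 - \|\theta_2 - \delta_2\|_2^2}{2\|\theta_1 - \theta_2\|_2^*}$

and note the subscripts on the norms in the numerators of the
terms on the right of the inequalities. These norms must be
examined in some detail.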

By definition

(6.4)  $\|\theta_1 - \delta_1\|_1^2 = (\theta_1 - \delta_1)'A_1(\theta_1 - \delta_1)$
       $\|\theta_1 - \delta_2\|_2^2 = (\theta_1 - \delta_2)'A_2(\theta_1 - \delta_2)$

but the specification (6.2) of $A_1$ and $A_2$ indicates that (6.4) can
be expressed in a more useful manner. Let $(\theta_1 - \delta_1)_m$
represent a vector composed of the first $m$ components of
$(\theta_1 - \delta_1)$. Similarly, let $(\theta_1 - \delta_2)_r$
represent a vector composed of the last $r = n - m$ components of
$(\theta_1 - \delta_2)$. Then (6.2) implies

(6.5)  $(\theta_1 - \delta_1)'A_1(\theta_1 - \delta_1) = (\theta_1 - \delta_1)_m'M(\theta_1 - \delta_1)_m$
       $(\theta_1 - \delta_2)'A_2(\theta_1 - \delta_2) = (\theta_1 - \delta_2)_r'N(\theta_1 - \delta_2)_r$

so that only the first $m$ components of $(\theta_1 - \delta_1)$ are
involved in the norm $\|\theta_1 - \delta_1\|_1$ and only the last
$r = n - m$ components of $(\theta_1 - \delta_2)$ are involved in the
norm $\|\theta_1 - \delta_2\|_2$. By similar definitions and using the
same argument

(6.6)  $\|\theta_2 - \delta_1\|_1^2 = (\theta_2 - \delta_1)_m'M(\theta_2 - \delta_1)_m$
       $\|\theta_2 - \delta_2\|_2^2 = (\theta_2 - \delta_2)_r'N(\theta_2 - \delta_2)_r$

so that respectively the first $m$ and last $r$ components are involved
in these norms also.

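Essentially the same phenomenon is observed in the norms in
the denominators of the terms in (6.3). Let $(\theta_1 - \theta_2)_m$
represent a vector composed of the first $m$ components of
$(\theta_1 - \theta_2)$. Similarly, let $(\theta_1 - \theta_2)_r$
represent a vector composed of the last $r = n - m$ components of
$(\theta_1 - \theta_2)$. Note that

(6.7)  $A_1\Sigma_1A_1 = \begin{pmatrix} M\Sigma_{11}M & O \\ O & O \end{pmatrix}$ , $A_2\Sigma_2A_2 = \begin{pmatrix} O & O \\ O & N\Sigma_{22}N \end{pmatrix}$

where $\Sigma_{11}$ is the $m \times m$ submatrix made up of the first
$m$ rows and $m$ columns of $\Sigma_1$ and $\Sigma_{22}$ is the
$r \times r$ submatrix made up of the last $r$ rows and $r$ columns of
$\Sigma_2$. Thus

(6.8)  $\|\theta_1 - \theta_2\|_1^* = \sqrt{(\theta_1 - \theta_2)_m'M\Sigma_{11}M(\theta_1 - \theta_2)_m}$
       $\|\theta_1 - \theta_2\|_2^* = \sqrt{(\theta_1 - \theta_2)_r'N\Sigma_{22}N(\theta_1 - \theta_2)_r}$

follows by definition.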

By combining (6.4) and (6.5) and substituting the result into
(6.3), and by substituting (6.6) and (6.8) into (6.3), it is easy to
see the following fact. For all voters in the 1st population the
choice of a candidate depends only upon the first $m$ components
of the vectors $w$, $\delta_1$, $\theta_1$, and $\theta_2$. Similarly,
for all voters in the 2nd population the choice of a candidate depends
only upon the last $r = n - m$ components of the vectors $v$,
$\delta_2$, $\theta_1$, and $\theta_2$. It also follows that the last
$r$ components of the vectors $w$ and the first $m$ components of the
vectors $v$ can be arbitrarily specified without affecting the analysis.

Suppose that $\theta_1$ is a dominant strategy in the 1st population
and $\theta_2$ is a dominant platform in the 2nd population. (If
$f_1(w)$ and $f_2(v)$ are multivariate normals, then
$\theta_1 = \delta_1$ and $\theta_2 = \delta_2$.) Define a new
vector $\theta$ which is composed of the first $m$ components of
$\theta_1$ and the last $r = n - m$ components of $\theta_2$. Then the
voters of the 1st population, for whom only the first $m$ components
are relevant, view $\theta$ as identically the same as $\theta_1$.
Similarly, the voters in the 2nd population, for whom only the last $r$
components are relevant, view $\theta$ as identically the same as
$\theta_2$. It follows that $\theta$ is dominant in both the 1st and
2nd populations and, therefore, wins over any other strategy in the
general election. The following theorem is established:

Theorem 6.1: Given the assumptions resulting in the voting rules (6.3)
and the specification (6.2) of the matrices $A_1$ and $A_2$, then if
$\theta_1$ is a dominant platform for the 1st population and $\theta_2$
is a dominant strategy for the 2nd population, the vector $\theta$,
which is composed of the first $m$ components of $\theta_1$ and the
last $r = n - m$ components of $\theta_2$, is a dominant strategy for
the general election.

This theorem has a rather intuitive interpretation. Given that
one of the mutually exclusive and exhaustive groups "desires" one
set of policies, that the other group "desires" another set of policies,
and that there is no conflict between the two sets of policies since
each refers to a mutually exclusive set of issues, then the politician
can enhance his chance of winning the election by giving each
group just what it desires.
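The composite platform of the theorem can be illustrated with a small
case. The Python sketch below uses hypothetical numbers with $n = 3$
issues and $m = 2$, builds $A_1$ and $A_2$ in the block form (6.2), and
shows that a voter from each population incurs the same loss (6.1)
under the composite $\theta$ as under the platform that is dominant in
his own population. The helper names and all values are assumptions
made for illustration.

```python
def quad_form(x, a):
    """x'Ax for a square matrix A given as a list of rows."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))

def loss(pref, theta, a):
    """Loss (6.1) of a voter with preferred position pref under platform theta."""
    diff = [p - t for p, t in zip(pref, theta)]
    return quad_form(diff, a)

def composite(theta1, theta2, m):
    """First m components of theta1 followed by the last n - m of theta2."""
    return theta1[:m] + theta2[m:]

# Block matrices as in (6.2): M is 2 x 2 positive definite, N is 1 x 1.
A1 = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 0.0]]
A2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]

theta1, theta2 = [1.0, -1.0, 0.0], [4.0, 2.0, 5.0]
theta = composite(theta1, theta2, 2)

w, v = [0.5, 0.2, 9.0], [7.0, 7.0, 4.0]   # one voter from each population
print(theta)
print(loss(w, theta, A1), loss(w, theta1, A1))   # equal for the 1st population
print(loss(v, theta, A2), loss(v, theta2, A2))   # equal for the 2nd population
```

Since each voter is indifferent between the composite and his own
population's dominant platform, the composite inherits the dominance of
both.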

7. Concluding Comments 

No attempt is made here to summarize the results which were
derived in this analysis. Yet, some of the remarks of the introduction
merit repetition. There is no claim that this simple model of
policy formation captures the anomalies of the modern, complex,
political phenomenon. Simplifying assumptions were made to
reduce the problem to a manageable size so that certain propositions
could be established. It is hoped that these propositions (as
well as the analysis itself) produce insights into the complexities
of policy formation in a democratic society.

One additional remark is warranted. It is clear that certain 
complications, such as multi-party competition under various con- 
ditions, can be introduced and analyzed within the broad frame- 
work developed here. These additional complications, as well as 
the relaxing of certain of the assumptions, must await the results 
of future efforts. 
Figure 3